PHP - Using a regular expression to validate an email address


Over the years I have slowly developed a regular expression that validates most email addresses correctly, assuming they don't use an IP address as the server part.

I use it in several PHP programs, and it works most of the time. However, from time to time I get contacted by someone who is having trouble with a site that uses it, and I end up having to make some adjustment (most recently I realized that I wasn't allowing 4-character TLDs).

What is the best regular expression you have or have seen for validating emails?

I've seen several solutions that use functions built from several shorter expressions, but I'd rather have one long complex expression in a simple function instead of several short expressions in a more complex function.

The fully RFC 822 compliant regex is inefficient and obscure because of its length. Fortunately, RFC 822 was superseded twice, and the current specification for email addresses is RFC 5322. RFC 5322 leads to a regex that can be understood if studied for a few minutes and is efficient enough for actual use.

One RFC 5322 compliant regex can be found at the top of the page at http://emailregex.com/ but it uses the IP address pattern that is floating around the internet with a bug that allows 00 for any of the unsigned byte decimal values in a dot-delimited address, which is illegal. The rest of it appears to be consistent with the RFC 5322 grammar and passes several tests using grep -Po, including cases with domain names, IP addresses, bad ones, and account names with and without quotes.

Correcting the 00 bug in the IP pattern, we obtain a working and fairly fast regex. (Scrape the rendered version, not the markdown, for actual code.)

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
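As a rough illustration of how the pattern above might be exercised, here is a minimal Python sketch. The pattern string is copied verbatim from the text. The function name `is_valid_email`, the use of whole-string matching, and the case-insensitive flag are assumptions made for this example (the pattern only lists lowercase letter ranges), not something the original answer specifies.

```python
import re

# Pattern copied verbatim from the text above.
RFC5322_PATTERN = r"""(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])"""

# Assumption: compile case-insensitively, since the pattern only lists a-z.
EMAIL_RE = re.compile(RFC5322_PATTERN, re.IGNORECASE)


def is_valid_email(address: str) -> bool:
    """Return True only if the entire string matches the pattern."""
    return EMAIL_RE.fullmatch(address) is not None


if __name__ == "__main__":
    samples = [
        "user@example.info",        # 4-character TLD
        '"john.doe"@example.com',   # quoted account name
        "user@[192.168.0.1]",       # IP address literal
        "user@[192.168.00.1]",      # the "00" case the corrected pattern rejects
        "myemail@address,com",      # comma instead of dot
    ]
    for sample in samples:
        print(sample, is_valid_email(sample))
```

Matching the whole string (rather than searching inside it) is a deliberate choice here: a search would happily find a valid address buried inside otherwise invalid input.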

Here is a diagram of the finite state machine for the above regexp, which is clearer than the regexp itself:

[Image: finite state machine diagram of the regexp]

The more sophisticated patterns in Perl and PCRE (the regex library used, e.g., in PHP) can correctly parse RFC 5322 without a hitch. Python and C# can do that too, but they use a different syntax from those first two. However, if you are forced to use one of the many less powerful pattern-matching languages, then it's best to use a real parser.

It's also important to understand that validating an address per the RFC tells you absolutely nothing about whether that address actually exists at the supplied domain, or whether the person entering the address is its true owner. People sign others up to mailing lists this way all the time. Fixing that requires a fancier kind of validation that involves sending that address a message that includes a confirmation token meant to be entered on the same web page where the address was entered.

Confirmation tokens are the only way to know you got the address of the person entering it. This is why most mailing lists now use that mechanism to confirm sign-ups. After all, anybody can put down president@whitehouse.gov, and that will even parse as legal, but it isn't likely to be the person at the other end.
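The confirmation-token idea can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions: the names `issue_token` and `confirm`, the one-day lifetime, and the in-memory dictionary are all hypothetical, and actually sending the message is left out. A real site would persist the pending tokens and mail the token to the address.

```python
import secrets
import time

# Hypothetical lifetime for this sketch: tokens expire after one day.
TOKEN_TTL_SECONDS = 24 * 60 * 60

# token -> (address, expiry timestamp). In-memory only; a real site would
# keep this in a database.
_pending = {}


def issue_token(address, now=None):
    """Create an unguessable token for an address and remember it."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    _pending[token] = (address, now + TOKEN_TTL_SECONDS)
    return token  # this value would be mailed to the address


def confirm(token, now=None):
    """Return the confirmed address, or None if the token is unknown or expired.

    Tokens are single-use: a token is removed as soon as it is presented.
    """
    now = time.time() if now is None else now
    entry = _pending.pop(token, None)
    if entry is None:
        return None
    address, expires_at = entry
    if now > expires_at:
        return None
    return address
```

Only someone who can read mail sent to the address can present the token, which is the property that syntax validation alone cannot provide.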

For PHP, you should not use the pattern given in "Validate an E-Mail Address with PHP, the Right Way", from which I quote:

There is some danger that common usage and widespread sloppy coding will establish a de facto standard for e-mail addresses that is more restrictive than the recorded formal standard.

That is no better than all the other non-RFC patterns. It isn't smart enough to handle even RFC 822, let alone RFC 5322. This one, however, is.

If you want to get fancy and pedantic, implement a complete state engine. A regular expression can only act as a rudimentary filter. The problem with regular expressions is that telling someone their perfectly valid e-mail address is invalid (a false positive) because your regular expression can't handle it is just rude and impolite from the user's perspective. A state engine built for the purpose can both validate and even correct e-mail addresses that would otherwise be considered invalid, because it disassembles the e-mail address according to each RFC. This allows for a potentially more pleasing experience, like

The specified e-mail address 'myemail@address,com' is invalid. Did you mean 'myemail@address.com'?
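A message like the one above can be produced with far less than a full state engine. The Python sketch below is not the state engine the answer describes; it is a much simpler heuristic that swaps commas for dots in the domain part and re-checks the result. The names `suggest_correction` and `validation_message` are hypothetical, and `SIMPLE_EMAIL_RE` is a deliberately simplified stand-in validator (unquoted account name, dotted host name), not the RFC 5322 pattern.

```python
import re

# Simplified stand-in validator, an assumption for this sketch: it covers
# only unquoted account names and dotted host names.
SIMPLE_EMAIL_RE = re.compile(
    r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r"@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?",
    re.IGNORECASE,
)


def suggest_correction(address):
    """Return a corrected address, or None if no suggestion applies."""
    if SIMPLE_EMAIL_RE.fullmatch(address):
        return None  # already valid, nothing to suggest
    local, sep, domain = address.rpartition("@")
    if not sep:
        return None  # no "@" at all; this heuristic cannot help
    # Heuristic: a comma in the domain is probably a mistyped dot.
    candidate = local + "@" + domain.replace(",", ".")
    if candidate != address and SIMPLE_EMAIL_RE.fullmatch(candidate):
        return candidate
    return None


def validation_message(address):
    """Return None for a valid address, otherwise a user-facing message."""
    if SIMPLE_EMAIL_RE.fullmatch(address):
        return None
    message = f"The specified e-mail address '{address}' is invalid."
    suggestion = suggest_correction(address)
    if suggestion is not None:
        message += f" Did you mean '{suggestion}'?"
    return message
```

The suggestion is only offered when the corrected candidate itself passes validation, so the function never proposes an address that would be rejected on resubmission.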

See also "Validating Email Addresses", including the comments, or "Comparing E-mail Address Validating Regular Expressions".


