Validate an E-Mail Address with PHP, the Right Way
The Internet Engineering Task Force (IETF) document, RFC 3696, “Application Techniques for Checking and Transformation of Names” by John Klensin, gives several valid e-mail addresses that are rejected by many PHP validation routines. The addresses Abc\@def@example.com, customer/department=shipping@example.com and !def!xyz%abc@example.com are all valid. One of the more popular regular expressions found in the literature rejects all of them:
"^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$"
This regular expression allows only the underscore (_) and hyphen (-) characters, numbers and lowercase alphabetic characters. Even assuming a preprocessing step that converts uppercase alphabetic characters to lowercase, the expression rejects addresses with valid characters, such as the slash (/), equal sign (=), exclamation point (!) and percent (%). The expression also requires that the highest-level domain component has only two or three characters, thus rejecting valid domains, such as .museum.
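The rejections are easy to reproduce. The following is a minimal sketch, assuming the expression is run through preg_match with added / delimiters rather than the POSIX-style call it was written for. The helper name popularRegexAccepts and the address curator@example.museum are illustrative and do not come from the original listing.

```php
<?php
// The first "popular" expression from the text, wrapped in PCRE delimiters.
// popularRegexAccepts() is a hypothetical helper name used only for this sketch.
function popularRegexAccepts(string $email): bool
{
    $pattern = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/';
    // Lowercase first, mirroring the preprocessing step the text assumes.
    return preg_match($pattern, strtolower($email)) === 1;
}

$validAddresses = [
    'Abc\@def@example.com',
    'customer/department=shipping@example.com',
    '!def!xyz%abc@example.com',
    'curator@example.museum', // illustrative address with a long top-level domain
];

foreach ($validAddresses as $email) {
    echo $email, ' => ', popularRegexAccepts($email) ? 'accepted' : 'rejected', "\n";
}
```

Each of the four addresses is reported as rejected, even though all of them are syntactically valid.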
Another favorite regular expression solution is the following:
"^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+.[a-zA-Z0-9-.]+$"
This regular expression rejects all the valid examples in the preceding paragraph. It does have the grace to allow uppercase alphabetic characters, and it doesn't make the error of assuming a high-level domain name has only two or three characters. It allows invalid domain names, such as example..com.
Listing 1 shows an example from PHP Dev Shed (www.devshed.com/c/a/PHP/Email-Address-Verification-with-PHP/2). The code contains (at least) three errors. First, it fails to recognize many valid e-mail address characters, such as percent (%). Second, it splits the e-mail address into user name and domain parts at the at sign (@). E-mail addresses that contain a quoted at sign, such as Abc\@def@example.com, will break this code. Third, it fails to check for host address DNS records. Hosts with a type A DNS entry will accept e-mail and may not necessarily publish a type MX entry. I'm not picking on the author at PHP Dev Shed. More than 100 reviewers gave this a four-out-of-five-star rating.
One of the better solutions comes from Dave Child's blog at ILoveJackDaniel's (ilovejackdaniels.com), shown in Listing 2 (www.ilovejackdaniels.com/php/email-address-validation). Not only does Dave love good-old American whiskey, he also did some homework, read RFC 2822 and recognized the true range of characters valid in an e-mail user name. About 50 people have commented on this solution at the site, including a few corrections that have been incorporated into the original solution. The only major flaw in the code collectively developed at ILoveJackDaniel's is that it fails to allow for quoted characters, such as \@, in the user name. It will reject an address with more than one at sign, so that it does not get tripped up splitting the user name and domain parts using explode("@", $email). A subjective criticism is that the code expends a lot of effort checking the length of each component of the domain portion—effort better spent simply trying a domain lookup. Others might appreciate the due diligence paid to checking the domain before executing a DNS lookup on the network.
The IETF documents RFC 1035 “Domain Names - Implementation and Specification”, RFC 2234 “Augmented BNF for Syntax Specifications: ABNF”, RFC 2821 “Simple Mail Transfer Protocol” and RFC 2822 “Internet Message Format”, in addition to RFC 3696 (referenced earlier), all contain information relevant to e-mail address validation. RFC 2822 supersedes RFC 822 “Standard for ARPA Internet Text Messages” and makes it obsolete.
Following are the requirements for an e-mail address, with relevant references:
1. An e-mail address consists of a local part and a domain separated by an at sign (@) character (RFC 2822 3.4.1).
2. The local part may consist of alphabetic and numeric characters, and the following characters: !, #, $, %, &, ', *, +, -, /, =, ?, ^, _, `, {, |, } and ~, possibly with dot separators (.), inside, but not at the start, end or next to another dot separator (RFC 2822 3.2.4).
3. The local part may consist of a quoted string, that is, anything within quotes ("), including spaces (RFC 2822 3.2.5).
4. Quoted pairs (such as \@) are valid components of a local part, though an obsolete form from RFC 822 (RFC 2822 4.4).
5. The maximum length of a local part is 64 characters (RFC 2821 4.5.3.1).
6. A domain consists of labels separated by dot separators (RFC 1035 2.3.1).
7. Domain labels start with an alphabetic character followed by zero or more alphabetic characters, numeric characters or the hyphen (-), ending with an alphabetic or numeric character (RFC 1035 2.3.1).
8. The maximum length of a label is 63 characters (RFC 1035 2.3.1).
9. The maximum length of a domain is 255 characters (RFC 2821 4.5.3.1).
10. The domain must be fully qualified and resolvable to a type A or type MX DNS address record (RFC 2821 3.6).
Requirement number four covers a now obsolete form that is arguably permissive. Agents issuing new addresses could legitimately disallow it; however, an existing address that uses this form remains a valid address.
The standard assumes a seven-bit character encoding, not multibyte characters. Consequently, according to RFC 2234, “alphabetic” corresponds to the Latin alphabet character ranges a–z and A–Z. Likewise, “numeric” refers to the digits 0–9. The lovely international standard Unicode alphabets are not accommodated—not even encoded as UTF-8. ASCII still rules here.
That's a lot of requirements! Most of them refer to the local part and domain. It makes sense, then, to start with splitting the e-mail address around the at sign separator. Requirements 2–5 apply to the local part, and 6–10 apply to the domain.
The at sign can be escaped in the local name. Examples are Abc\@def@example.com and "Abc@def"@example.com. This means an explode on the at sign, $split = explode("@", $email);, or another similar trick to separate the local and domain parts will not always work. We can try removing escaped at signs, $cleanat = str_replace("\\@", "", $email);, but that will miss pathological cases, such as Abc\\@example.com. Fortunately, such escaped at signs are not allowed in the domain part. The last occurrence of the at sign must definitely be the separator. The way to separate the local and domain parts, then, is to use the strrpos function to find the last at sign in the e-mail string.
Listing 3 provides a better method for splitting the local part and domain of an e-mail address. The return type of strrpos will be boolean-valued false if the at sign does not occur in the e-mail string.
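The split can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of Listing 3: the helper name splitAddress and its null-on-failure return convention are hypothetical.

```php
<?php
// splitAddress() is a hypothetical helper: it returns [local, domain],
// or null when the string contains no at sign.
function splitAddress(string $email): ?array
{
    // The LAST at sign is the separator; earlier ones belong to the local part.
    $atIndex = strrpos($email, '@');
    if ($atIndex === false) {
        return null;
    }
    $local  = substr($email, 0, $atIndex);
    $domain = substr($email, $atIndex + 1);
    return [$local, $domain];
}
```

Comparing with === false matters here, because an at sign in position 0 makes strrpos return the integer 0, which a loose comparison would confuse with "not found".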
Let's start with the easy stuff. Checking the lengths of the local part and domain is simple. If those tests fail, there's no need to do the more complicated tests. Listing 4 shows the code for making the length tests.
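A minimal sketch of the length tests, assuming the limits from requirements 5 and 9 (64 characters for the local part, 255 for the domain) and treating empty parts as invalid. The function name lengthsAreValid is hypothetical and is not taken from Listing 4.

```php
<?php
// lengthsAreValid() is a hypothetical helper name used for illustration.
function lengthsAreValid(string $local, string $domain): bool
{
    $localLen  = strlen($local);
    $domainLen = strlen($domain);

    // Local part: 1..64 characters; domain: 1..255 characters.
    return $localLen >= 1 && $localLen <= 64
        && $domainLen >= 1 && $domainLen <= 255;
}
```

Because the standard assumes seven-bit characters, counting bytes with strlen is equivalent to counting characters for any address that can pass the later tests.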
Now, the local part has one of two forms. It may have a begin and end quote with no unescaped embedded quotes. The local part "Doug \"Ace\" L." is an example. The second form for the local part is (a+(\.a+)*), where a stands for a whole slew of allowable characters. The second form is more common than the first; so, check for that first. Look for the quoted form after failing the unquoted form.
Characters quoted using the back slash (\@) pose a problem. This form allows doubling the back-slash character to get a back-slash character in the interpreted result (\\). This means we need to check for an odd number of back-slash characters quoting a non-back-slash character. We need to allow \\\\\@ and reject \\\\@.
It is possible to write a regular expression that finds an odd number of back slashes before a non-back-slash character. It is possible, but not pretty. The appeal is further reduced by the fact that the back-slash character is an escape character in PHP strings and an escape character in regular expressions. We need to write four back-slash characters in the PHP string representing the regular expression to show the regular expression interpreter a single back slash.
A more appealing solution is simply to strip all pairs of back-slash characters from the test string before checking it with the regular expression. The str_replace function fits the bill. Listing 5 shows a test for the content of the local part.
The regular expression in the outer test looks for a sequence of allowable or escaped characters. Failing that, the inner test looks for a sequence of escaped quote characters or any other character within a pair of quotes.
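The two tests can be sketched as follows. This is a hedged illustration of the approach rather than Listing 5 itself: the patterns use the character group given in this article, while the function name localPartCharactersAreValid is a hypothetical choice.

```php
<?php
// localPartCharactersAreValid() is a hypothetical name used for illustration.
function localPartCharactersAreValid(string $local): bool
{
    // Strip pairs of back slashes first, so that any back slash left over
    // must be quoting the character that follows it.
    $stripped = str_replace('\\\\', '', $local);

    // Outer test: a run of allowable characters or back-slash-quoted characters.
    if (preg_match('/^(\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-])+$/', $stripped)) {
        return true;
    }

    // Inner test: a quoted string holding escaped quotes or any non-quote character.
    return preg_match('/^"(\\\\"|[^"])+"$/', $stripped) === 1;
}
```

With this sketch, Abc\@def and "Abc@def" pass, while hello world fails because the space is neither an allowable character nor quoted. Dot placement is deliberately not checked here.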
If you are validating an e-mail address entered as POST data, which is likely, you have to be careful about input that contains back-slash (\), single-quote (') or double-quote characters ("). PHP may or may not escape those characters with an extra back-slash character wherever they occur in POST data. The name for this behavior is magic_quotes_gpc, where gpc stands for get, post, cookie. You can have your code call the function, get_magic_quotes_gpc(), and strip the added slashes on an affirmative response. You also can ensure that the PHP.ini file disables this “feature”. Two other settings to watch for are magic_quotes_runtime and magic_quotes_sybase.
The two regular expressions in Listing 5 are appealing because they are relatively easy to comprehend and don't require repetition of the allowable character group, [A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-]. Here's a test for you. Why does the character group require two back-slash characters before the forward slash and one back-slash character before the single quote?
One deficiency of the outer test of Listing 5 is that it passes local part strings that include dots anywhere in the string. Requirement number two states that dots can't start or end the local part, and they can't appear together two or more times. We could address this by expanding the outer regular expression into the form ^(a+(\.a+)*)$, where a is (\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~-]). We could, but that leads to a long, hard-to-read, repetitive expression that's difficult to believe in. It's clearer to add the simple checks shown in Listing 6.
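The simple dot checks might look like the following sketch. It is an assumption about the shape of such checks, not a copy of Listing 6, and the function name localPartDotsAreValid is hypothetical.

```php
<?php
// localPartDotsAreValid() is a hypothetical helper name used for illustration.
function localPartDotsAreValid(string $local): bool
{
    if ($local === '') {
        return false; // nothing to inspect
    }
    // A dot may not start or end the local part.
    if ($local[0] === '.' || $local[strlen($local) - 1] === '.') {
        return false;
    }
    // Two dots may not sit next to each other.
    return strpos($local, '..') === false;
}
```

A plain strpos search is used for the consecutive-dot test here; a regular expression such as /\.\./ would give the same answer.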
The local part is a wrap. The code now checks all local part requirements. Checking the domain will complete the e-mail validation. The code could check all of the labels in the domain separately, as does the whiskey-loving code shown in Listing 2, but, as hinted earlier, the solution presented here allows the DNS check to do most of the domain validation work.
Listing 7 makes a cursory check to ensure only valid characters in the domain part, with no repeated dots. It goes on to make DNS lookups for MX and A records. It makes the check for the A record only if the MX record check fails. The code in Listing 4 verified the length of the domain value.
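A sketch of that sequence follows, under stated assumptions. The function name domainIsValid and the optional $hasRecord parameter are hypothetical additions that let the DNS lookup be swapped out for testing; by default the sketch calls PHP's checkdnsrr.

```php
<?php
// domainIsValid() and its $hasRecord parameter are hypothetical; the
// parameter exists only so a fake resolver can stand in for checkdnsrr().
function domainIsValid(string $domain, ?callable $hasRecord = null): bool
{
    $hasRecord = $hasRecord ?? 'checkdnsrr';

    // Cursory character check: letters, digits, hyphens and dots only.
    if (!preg_match('/^[A-Za-z0-9\\-\\.]+$/', $domain)) {
        return false;
    }
    // No repeated dots.
    if (strpos($domain, '..') !== false) {
        return false;
    }
    // Look for an MX record; fall back to an A record only if that fails.
    return $hasRecord($domain, 'MX') || $hasRecord($domain, 'A');
}
```

The short-circuit || operator is what keeps the A record lookup from running when the MX lookup has already succeeded.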
So, is it good? You decide. But, it would be nice to test the logic to ensure that it at least is correct. Listing 8 contains a series of e-mail address test cases that any e-mail validation should pass.
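A test harness for this purpose can be quite small. The sketch below is not Listing 8: it uses only a handful of the test addresses that appear in the discussion of this article, and the function name runAddressTests is hypothetical. It accepts any validator as a callable and returns the addresses the validator got wrong.

```php
<?php
// runAddressTests() is a hypothetical harness; $validator is any callable
// that takes an address string and returns true (valid) or false (invalid).
function runAddressTests(callable $validator): array
{
    // A small subset of test addresses: address => expected result.
    $cases = [
        'dclo@us.ibm.com'           => true,
        'abc\@def@example.com'      => true,
        '"Fred Bloggs"@example.com' => true,
        'abc@def@example.com'       => false,
        '.dot@example.com'          => false,
        'two..dot@example.com'      => false,
    ];

    $mismatches = [];
    foreach ($cases as $email => $expected) {
        if ((bool) $validator($email) !== $expected) {
            $mismatches[] = $email;
        }
    }
    return $mismatches; // empty when the validator agrees with every case
}
```

Single-quoted PHP strings keep the addresses readable, because a back slash before an at sign is stored literally and needs no doubling.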
Be sure to run the test to see the valid and rejected e-mail addresses, because the double-escaping (\\) inside the PHP strings tends to obfuscate the addresses. You're challenged to subject your favorite e-mail validation code to this test. Be assured that the code in Listing 9 does pass!
Listing 9 contains a complete function for validating an e-mail address. It isn't as concise as many—it certainly isn't a one-liner. But, it is straightforward to read and comprehend, and it correctly accepts and rejects e-mail addresses that many other published functions incorrectly reject and accept. The function orders the validation tests roughly according to increasing cost. In particular, the more complex regular expression and, certainly, the DNS lookup, both come last.
Spread the word! There is some danger that common usage and widespread sloppy coding will establish a de facto standard for e-mail addresses that is more restrictive than the recorded formal standard. If you want to fool the spambots, adopt an e-mail address like, {^c\@**Dog^}@cartoon.com. Unfortunately, you might fool some legitimate e-commerce sites as well. Which do you suppose will adapt more quickly?
Submitted by Douglas Lovell on Fri, 2007-06-01 01:00.

need to update the link to ilovejackdaniels
ilovejackdaniels.com will die soon, the new link is here:
http://www.addedbytes.com/php/email-address-validation
I agree: too much DNS, and is too slow
First off,
DNS lookups are quite slow, but without them this script will validate many obviously wrong addresses (like "a@b").
With a google search, I found this function which is much faster and it does a pretty good job, especially for not relying on the DNS at all...
function emailcheck($email) {
    return preg_match('/^(?:[\w\!\#\$\%\&\'\*\+\-\/\=\?\^\`\{\|\}\~]+\.)*[\w\!\#\$\%\&\'\*\+\-\/\=\?\^\`\{\|\}\~]+@(?:(?:(?:[a-zA-Z0-9_](?:[a-zA-Z0-9_\-](?!\.)){0,61}[a-zA-Z0-9_]?\.)+[a-zA-Z0-9_](?:[a-zA-Z0-9_\-](?!$)){0,61}[a-zA-Z0-9_]?)|(?:\[(?:(?:[01]?\d{1,2}|2[0-4]\d|25[0-5])\.){3}(?:[01]?\d{1,2}|2[0-4]\d|25[0-5])\]))$/', $email);
}
Also,
instead of:
preg_match('/\\.\\./', $local)
use:
strpos($local, '..') !== false
strpos() is much faster than preg_match.
Relies too much on DNS
DNS checking is resource intensive. If you take out the DNS checks, all of the bugs in the logic start to appear.
This routine will accept the following as a valid email address:
a@b.c
a@b
a@b.
It is doing zero validation of the host and domain extension. Instead it relies on DNS to do these checks, which is a waste of system resources.
Since this article was posted over two years ago and no corrections have been made, I would suggest looking for something better.
Awesome script, just what I was looking for!!
Great script with comments and explanations so you can learn and understand what the code is doing!! Excellent!!
sorry this code have a
Sorry, this code has a bug.
Just test: michael.good@gmail
Without any .COM or .ANYTHING, the function recognizes the e-mail as valid!
It is prob due to your dns
It is prob due to your dns search suffix, is it set to .com? if so its appending .com onto domains that are not fqdn
Thanks mate, this is working
Thanks mate, this is working just perfectly for me ;-)
Things don't have to be perfect in my opinion; as long as some scumbag spambots cannot spam with viagra_pills@for.free, I'm happy :-D
This will reject 'postmaster'.
These validations will reject 'postmaster' which is, at least in some circumstances such as an SMTP RCPT line, required to be considered a valid email address.
I'm a little late, but I think I found a bug...
Hi there,
I am using this code to validate emails on the fly with the help of AJAX, on my site. I noticed that as I entered random email addresses the user could simply put
"myemail@a"
and it would consider it valid.
it didn't need the .____ attached.
so I added this to the middle of the validation:
else if (!preg_match('/\\./', $domain))
{
// domain has no dots
$isValid = false;
}
It fixed it.
Thanks for the code :)
I wrote my own e-mail
I wrote my own e-mail validation function after spending hours with RFC 2822. It passes all of the above test cases with NOYB's corrections. I would appreciate if you submit any bugs to: geoffreyj.lee at Gmail.
function validateEmail($input)
{
    $atom = '[a-zA-Z0-9!#$%&\'*+\-\/=?^_`{|}~]+';
    $quoted_string = '"[^"\\\\\r\n]*"';
    $word = "$atom(\.$atom)*";
    $domain = "$atom(\.$atom)+";
    return strlen($input) < 256 && preg_match("/^($word|$quoted_string)@{$domain}\$/", $input);
}
I realized that my
I realized that my quoted-string regexp allowed too many characters. Here's the corrected version:
function validateEmailAddress($input)
{
    $atom = '[a-zA-Z0-9!#$%&\'*+\-\/=?^_`{|}~]+';
    $quoted_string = '"([\x1-\x9\xB\xC\xE-\x21\x23-\x5B\x5D-\x7F]|\x5C[\x1-\x9\xB\xC\xE-\x7F])*"';
    $word = "$atom(\.$atom)*";
    $domain = "$atom(\.$atom)+";
    return strlen($input) < 256 && preg_match("/^($word|$quoted_string)@${domain}\$/", $input);
}
Question
Hello guys
I have an Internet connection and I installed the latest version of XAMPP. I would like to test e-mail validation. This is the form:
Enter your email :
But it's not working. Please, can you help me?
Reply urgent
just thought i'd leave a
just thought i'd leave a note to say your link to Dave Child's website, ilovejackdaniels.com has changed to addedbytes.com
Wee fix
This should check the top-level domain as well. I thought everything is fine until someone typed email address like this xxx@yyy.p instead of xxx@yyy.pl . DNS checking is switched off on a server so I cannot validate email address using it.
Lets add:
elseif (strlen(substr($domain, strrpos($domain, '.')+1)) < 2 || strlen(substr($domain, strrpos($domain, '.')+1)) > 6) {
$isValid = false;
}
Top level domain AFAIK is at least 2 char long and 'museum' is longest at the moment. This should do the trick.
Thanks for this code BTW! Very useful.
Wrong start!
$isValid = true;
Should be $isValid = false;
If everything fails it should always return false... pfff, spread the word!
R U Stupid?
The beginning of the code is perfect, "$isValid = true;" and not what you said!
It starts with the idea that the email is valid, and the checks are made!
If it doesn't pass, then it will return false!
He has a good point and not stupidity...
the first set should be false...
to prove that the email is true (and only true) is to pass through tests.. after all tests have been done and all passed, that's the only time you set and agree that the email is valid. It is rather right than saying, at first the email is already valid and go through all the tests and prove it wrong.
What about other languages?
Great article! Everything was neatly explained. I would say that Tom Burt is right though, the wording in the rule section seems to imply that a domain name can not begin with a number, which of course is wrong.
But what about if I wanted to do it in other languages like ASP.NET, or just plain javascript, is there any chance you will be working on examples for that too? :)
This doesn't work with
This doesn't work with emails such as "someone@somewhere.co.uk" or "someone@somewhere.mn" ....
A validator that can tell back the exact nature of the anomaly.
Hello!
First, I want to thank and congratulate the author of this article for its quality and desirability.
I admit that, wishing to write a form that tests the submited addresses, I have been searching for a long time in vain on the Web a document that clearly explains the email addresses syntax, and that it seems I found it here.
I want to write a mail addresses validator that can tell back the visitor the exact nature of the anomaly. Furthermore, I don't want to use regular expressions, often reading bad things about them, and... not knowing how to use them.
So I show you candidly the code I wrote, for submission to the fire of your critics. It is not perfect: particularly in the management of the escapement.
Here you have:
<?php // testor_email_0.php $testable_mail = html_entity_decode( $mail ) ; $butee = strlen( $testable_mail ) ; $aro_pos = strrpos( $testable_mail, $aro ) ; $nout = array( "nom d'utilisateur", "user name" ) ; $nodo = array( "nom de domaine", "domain name" ) ; $car = array( "caractère", "character" ) ; $et = "être" ; if( ! $testable_mail ) $avertissement = array( "$viv $adel[$lang_index]", "$svps[$lang_index] include your $adel[$lang_index]" ) ; else if ( $aro_pos === FALSE ) $avertissement = array( "Votre $adel[$lang_index] doit comporter une arobase", "Your $adel[$lang_index] must include the at sign" ) ; else if ( $aro_pos == 0 ) $avertissement = array( "Votre $adel[$lang_index] doit comporter un $nout[$lang_index]", "Your $adel[$lang_index] must have a $nout[$lang_index]" ) ; else if ( $aro_pos == $butee - 1 ) $avertissement = array( "Votre $adel[$lang_index] doit comporter un $nodo[$lang_index]", "Your $adel[$lang_index] must have a $nodo[$lang_index]" ) ; else if( $testable_mail{0} == $dot ) $avertissement = array( "Un point ne peut pas débuter votre $adel[$lang_index]", "A dot cannot begin your $adel[$lang_index]" ) ; else if( $testable_mail{$butee - 1} == $dot ) $avertissement = array( "Un point ne peut pas terminer votre $adel[$lang_index]", "A dot cannot end your $adel[$lang_index]" ) ; else { $segments = explode( $dot, $testable_mail ) ; foreach( $segments as $segment ) if( ! strlen( $segment ) ) { $avertissement = array( "Deux points ne peuvent pas $et contigus dans votre $adel[$lang_index]", "Two dots can not be contiguous in your $adel[$lang_index]" ) ; break ; } include_once "Data/hilite.php" ; $numeri_cars = range( "0", "9" ) ; if( ! $avertissement ) include "testor_email_1.php" ; if( ! 
$avertissement ) include "testor_email_2.php" ; } if( $avertissement ) $a_servir = "mail" ; ?><?php // testor_email_1.php $testable_str = substr( $testable_mail, 0, $aro_pos ) ; $butee = strlen( $testable_str ) ; $gui = '"' ; $gui_nombre = substr_count( $testable_str, $gui ) ; $dir = dir_extraire( __file__ ) ; /* longueur maximum */ $max = 64 ; if( $butee > $max ) $avertissement = array( "Le nombre de car[$lang_index]s de votre $nout[$lang_index] ne peut excéder $max", "The number of car[$lang_index]s in your $nout[$lang_index] can not exceed $max" ) ; /* point a la fin */ else if( $testable_str{ $butee - 1 } == $dot ) $avertissement = array( "Un point ne peut $et contigu à l'arobase", "A dot cannot be contiguous to the at sign" ) ; /* guillemets */ else if( $gui_nombre ) include "$dir/testor_email_guillemets.php" ; /* defaut */ else { $zauts_str = "!, #, $, %, &, ', *, +, -, /, =, ?, ^, _, `, {, |, }, ~, $dot" ; $zauts_list = explode( $vs, $zauts_str ) ; $valides = array_merge( $lettres, $numeri_cars, $zauts_list ) ; for( $i = 0; $i < $butee; $i++ ) { $ze_car = $testable_str{$i} ; if( ! 
in_array( $ze_car, $valides ) ) { $ze_car = hilite( $ze_car ) ; $avertissement = array( "Le $car[$lang_index] $ze_car ne peut pas figurer dans votre $nout[$lang_index]", "The $car[$lang_index] $ze_car cannot appear in your $nout[$lang_index]" ) ; break ; } } } ?><?php // testor_email_guillemets.php if( $gui_nombre == 1 ) $avertissement = array( "Les guillemets doivent se présenter par paire dans votre $nout[$lang_index]", "The double-quotes must show by pair in your $nout[$lang_index]" ) ; else if( $gui_nombre == 2 ) { if( $testable_str{0} != $gui || $testable_str{ $butee - 1 } != $gui ) $avertissement = array( "Les guillemets doivent se présenter aux extrémités de votre $nout[$lang_index]", "The double-quotes must show at the ends of your $nout[$lang_index]" ) ; } else $avertissement = array( "Votre $nout[$lang_index] ne peut pas avoir de guillemets ailleurs qu'aux deux extrémités", "Your $nout[$lang_index] cannot have double-quotes anywhere else but at both ends" ) ; ?><?php // testor_email_2.php $testable_str = substr( $testable_mail, $aro_pos + 1 ) ; $butee = strlen( $testable_str ) ; $max = 255 ; if( $butee > $max ) $avertissement = array( "Le nombre de $car[$lang_index]s de votre $nodo[$lang_index] ne peut excéder $max", "The number of $car[$lang_index]s in your $nodo[$lang_index] can not exceed $max" ) ; else if( $testable_str{0} == $dot ) $avertissement = array( "Le premier $car[$lang_index] de votre $nodo[$lang_index] ne peut pas $et un point", "Your $nodo[$lang_index] can not begin with a dot" ) ; else { $segments = explode( $dot, $testable_str ) ; foreach( $segments as $segment ) { $segment_len = strlen( $segment ) ; $max = 63 ; if( $segment_len > $max ) { $avertissement = array( "Le nombre de $car[$lang_index]s entre deux points dans votre $nodo[$lang_index] ne peut excéder $max", "The number of $car[$lang_index]s between two dots in your $nodo[$lang_index] can not exceed $max" ) ; break ; } $ze_car = $segment{0} ; if( ! 
in_array( $ze_car, $lettres ) ) { $ze_car = hilite( $ze_car ) ; $avertissement = array( "Le $car[$lang_index] $ze_car ne peut pas figurer immédiatement après un point ou l'arobase dans le $nodo[$lang_index] de votre $adel[$lang_index]", "The $car[$lang_index] $ze_car cannot show just after a dot or the at sign in the $nodo[$lang_index] of your $adel[$lang_index]" ) ; break ; } $valides = array_merge( $lettres, $numeri_cars ) ; $ze_car = $segment{$segment_len-1} ; if( ! in_array( $ze_car, $valides ) ) { $ze_car = hilite( $ze_car ) ; $avertissement = array( "Le $car[$lang_index] $ze_car ne peut figurer : ni immédiatement avant un point dans le, ni à la fin du, $nodo[$lang_index] de votre $adel[$lang_index]", "The $car[$lang_index] $ze_car cannot show, neither just before a dot in, nor at the end of, the $nodo[$lang_index] of your $adel[$lang_index]" ) ; break ; } $valides[] = $tiret ; $butee = $segment_len - 1 ; for( $i = 1; $i < $butee; $i++ ) { $ze_car = $segment{$i} ; if( ! in_array( $ze_car, $valides ) ) { $ze_car = hilite( $ze_car ) ; $avertissement = array( "Le $car[$lang_index] $ze_car ne peut pas figurer dans votre $nodo[$lang_index]", "The $car[$lang_index] $ze_car cannot appear in your $nodo[$lang_index]" ) ; break 2 ; } } } } if( ! $avertissement ) { $dot_pos = strrpos( $testable_str, $dot ) ; if( $dot_pos === FALSE ) $avertissement = array( "Veuillez indiquer un domaine de niveau supérieur à votre $adel[$lang_index]", "$svps[$lang_index] indicate a top level domain to your $adel[$lang_index]" ) ; if( ! $avertissement ) { $tld = substr( $testable_str, $dot_pos + 1 ) ; include "../Data/tlds.php" ; if( ! in_array( $tld, $tlds ) ) $avertissement = array( "Le domaine de niveau supérieur que vous avez indiqué ne figure pas dans notre liste de référence", "The top level domain you indicate is not in our list" ) ; if( ! $avertissement && ! checkdnsrr( $testable_str ) && ! 
checkdnsrr( $testable_str, "A" ) ) $avertissement = array( "Le $nodo[$lang_index] que vous avez indiqué n'est pas reconnu par internet", "The $nodo[$lang_index] you indicate is not recognized by internet" ) ; } } ?>
Thank you for your contribution,
Sacapuss
I appreciate
I appreciate your efforts in producing a comprehensive email validation function for php.
Yet Another Email Address Validator
I've had a go at this too. One reason being that the code here is All Rights Reserved by Linux Journal, so I don't think you can use it in your project.
Here's my effort: RFC-compliant email address validator
I've done more checking of the domain part, particularly allowing the IP address format even though it's discouraged by the RFCs.
I believe my function respects RFCs 1123, 2396, 3696, 4291, 4343, 5321 & 5322. Please let me know if you find any problems with it.
PHP 4.0.0 Update
The line:
if (is_bool($atIndex) && !$atIndex) {
can now be updated to read:
if ($atIndex === false) {
This looks a little cleaner, but may be harder to read or confuse older PHP developers.
It can be better written this way:
<?php
# Offers methods for validating user input
class Validate
{
    static function email($email)
    {
        $isValid = true;
        $atIndex = strrpos($email, '@');
        if (is_bool($atIndex) && !$atIndex) {
            return false;
        } else {
            $domain = substr($email, $atIndex + 1);
            $local = substr($email, 0, $atIndex);
            $validLocalLength = Validate::length($local, 1, 64);
            $validDomainLength = Validate::length($domain, 1, 255);
            $validStartFinish = !($local[0] == '.' || $local[$localLen - 1] == '.');
            $validLocalDots = !preg_match('/\\.\\./', $local);
            $validDomainCharacters = preg_match('/^[A-Za-z0-9\\-\\.]+$/', $domain);
            $validDomainDots = !preg_match('/\\.\\./', $domain);
            $validLocalCharacters = !(!preg_match('/^(\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-])+$/', str_replace("\\\\","",$local)) && !preg_match('/^"(\\\\"|[^"])+"$/', str_replace("\\\\","",$local)));
            $validMailRecord = checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A');
            return $validLocalLength && $validDomainLength && $validStartFinish && $validLocalDots && $validDomainCharacters && $validDomainDots && $validLocalCharacters && $validMailRecord;
        }
    }

    static function length($input, $min, $max)
    {
        return isset($input[$min - 1]) && !isset($input[$max]);
    }
}
?>
fix
Replace the line:
$validStartFinish = !($local[0] == '.' || $local[$localLen - 1] == '.');
with:
$validStartFinish = !($local[0] == '.' || $local[strlen($local) - 1] == '.');
since $localLen isn't defined
Failed Verification
I like how you wrote the code. I did a test on the e-mails that were in the article and your script worked fine, except it said the following e-mails were valid when they should have been invalid:
dot.@example.com
Doug\ \"Ace\"\ L\.@example.com
function validEmail:
I've also tried this function below to replace checkdnsrr because it doesn't work at all in windows, but still not working, page always keep on loading and nothing displayed:
function myCheckDNSRR($hostName, $recType = '')
{
    if (!empty($hostName)) {
        if ($recType == '') $recType = "MX";
        exec("nslookup -type=$recType $hostName", $result);
        // check each line to find the one that starts with the host
        // name. If it exists then the function succeeded.
        foreach ($result as $line) {
            if (eregi("^$hostName", $line)) {
                return true;
            }
        }
        // otherwise there was no mail handler for the domain
        return false;
    }
    return false;
}
This is from PHP Mail Validator
Hei Yo
validEmail not trapping invalid domains correctly
I appreciate your efforts in producing a comprehensive email validation function for php. Unfortunately, when I tried to implement and test this function, it does not appear to trap invalid domains correctly. For example:
echo(_valid_email('autoit_heidi@yahoo.com')); (valid) Returns true
echo(_valid_email('autoit_heidi@yahoo.co')); (invalid domain) Returns true
echo(_valid_email('autoit_heidi@111111111111111111.com')); (invalid domain) Returns true
Is this the more current code?
quotation marks
I used the functionality given in this article in a test case, and emails with quotation marks, with both embedded and without (the embedded ones had proper escape characters) both failed the verification standards...either there must be an update...or someone is lying
"abc@def"@example.com doesnt
"abc@def"@example.com
doesnt work
and
"Fred \"quota\" Bloggs"@example.com
doesnt work...if its supposed to, why isnt it?
bump
bump.
But seriously is this gonna get an update for the problem whereas the domain part of an email adress is not allowed to start with a number and yet the function allows it?
domains are allowed to start with a digit
not sure exactly when they were allowed, but domains can start with a digit.
filter_var
I suppose a simple
filter_var($email, FILTER_VALIDATE_EMAIL);
isn't enough ?
FILTER_VALIDATE_EMAIL
I believe this function only works for PHP 5 or above
filter_var isn't perfect either
Yes, this function was introduced in PHP 5.2, and it isn't as comprehensive. A test of filter_var in PHP 5.3 gives:
All of these should succeed:
dclo@us.ibm.com is valid.
abc\@def@example.com is not valid.
abc\\@example.com is not valid.
Fred\ Bloggs@example.com is not valid.
Joe.\\Blow@example.com is not valid.
"Abc@def"@example.com is valid.
"Fred Bloggs"@example.com is valid.
customer/department=shipping@example.com is not valid.
$A12345@example.com is valid.
!def!xyz%abc@example.com is valid.
_somename@example.com is valid.
user+mailbox@example.com is valid.
peter.piper@example.com is valid.
Doug\ \"Ace\"\ Lovell@example.com is not valid.
"Doug \"Ace\" L."@example.com is not valid.
All of these should fail:
abc@def@example.com is not valid.
abc\\@def@example.com is not valid.
abc\@example.com is not valid.
@example.com is not valid.
doug@ is not valid.
"qu@example.com is not valid.
ote"@example.com is not valid.
.dot@example.com is not valid.
dot.@example.com is valid.
two..dot@example.com is valid.
"Doug "Ace" L."@example.com is not valid.
Doug\ \"Ace\"\ L\.@example.com is not valid.
hello world@example.com is not valid.
gatsby@f.sc.ot.t.f.i.tzg.era.l.d. is not valid.
The email validation is deficient.
Updates?
I'd really like to see this article updated in response to some of these comments. Particularly, NOYB and the concerns about IP address domains (even if the given examples are incorrect).
Other updates I'd like to see include:
For now, I'm using your function with a few modifications (including implementing a "trust scale" of 0.0-1.0 instead of an absolute true/false), but my quest for One Email Validator to Rule Them All continues. It'd be awesome if we could somehow get to a point where we didn't need to send any annoying confirmation emails. All in all, great work.
Great code
This is a very nice routine, once and for all. Currently I'm testing on a Windows machine, and Windows doesn't support the checkdnsrr function, so I modified it the following way to work on Windows.
/* Following code should be activated if hosting is on Linux.
if ($isValid && !(checkdnsrr($domain,"MX") || checkdnsrr($domain,"A"))) {
  // domain not found in DNS
  $isValid = false;
}
Following code should be activated if hosting is on Windows. */
if ($isValid && !(myCheckDNSRR($domain,"MX") || myCheckDNSRR($domain,"A"))) {
  // domain not found in DNS
  $isValid = false;
}

function myCheckDNSRR($hostName, $recType = '') {
  if (!empty($hostName)) {
    if ($recType == '') $recType = "MX";
    // escapeshellarg() keeps a hostile host name from injecting shell commands
    exec("nslookup -type=" . escapeshellarg($recType) . " " . escapeshellarg($hostName), $result);
    // check each line to find the one that starts with the host
    // name. If it exists then the function succeeded.
    foreach ($result as $line) {
      // preg_match() replaces eregi(), which was removed in PHP 7
      if (preg_match('/^' . preg_quote($hostName, '/') . '/i', $line)) {
        return true;
      }
    }
    // otherwise there was no mail handler for the domain
    return false;
  }
  return false;
}
And please pardon my lack of knowledge; I am very new to programming and just trying to play with it. It's not my code, I found it elsewhere. But I thought it would help.
Thanks
3.4. Address Specification
It was good to read that you wanted to get this formally right, once and for all. I really appreciate this. BUT: let's take RFC 2822 and check out what exactly an address is:
http://www.faqs.org/rfcs/rfc2822.html
3.4. Address Specification
Addresses occur in several message header fields to indicate senders
and recipients of messages. An address may either be an individual
mailbox, or a group of mailboxes.
address = mailbox / group
mailbox = name-addr / addr-spec
name-addr = [display-name] angle-addr
angle-addr = [CFWS] "<" addr-spec ">" [CFWS] / obs-angle-addr
group = display-name ":" [mailbox-list / CFWS] ";"
[CFWS]
display-name = phrase
mailbox-list = (mailbox *("," mailbox)) / obs-mbox-list
address-list = (address *("," address)) / obs-addr-list
So I would really like to see your routine be OK with this definition. For example, angle-addr as part of mailbox is not really supported. Your routine does not even check for the right mailbox definition in an address. I have not checked whether groups are. Isn't this the right place to look for the definition of an email address?
I think you are taking the
I think you are taking the wrong section. What that seems to describe is the way addresses are written in headers and such.
That is something like:
Julius Caesar
What the article is about is the addr-spec, that is the part between angle brackets.
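To make the distinction concrete, here is a rough sketch of pulling the addr-spec out of a header-style mailbox before handing it to a validator. The function name extractAddrSpec and the address julius@example.com are made up for illustration, and the pattern is a simplification: it ignores comments, groups and quoted strings that themselves contain angle brackets.

```php
<?php
// Hypothetical helper: pulls the addr-spec out of a header-style mailbox.
// "Display Name <local@domain>" yields "local@domain"; a bare addr-spec
// is returned unchanged (trimmed).
function extractAddrSpec(string $mailbox): string
{
    $mailbox = trim($mailbox);
    // Simplification: take whatever sits inside the final <...> pair.
    if (preg_match('/<([^<>]*)>$/', $mailbox, $m) === 1) {
        return $m[1];
    }
    return $mailbox;
}

echo extractAddrSpec('Julius Caesar <julius@example.com>'), "\n";
echo extractAddrSpec('julius@example.com'), "\n";
```

Both calls print julius@example.com, which is the part the article's validation function is meant to receive.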
Validate an E-Mail Address with PHP... (Javascript version)
Please replace:
1. strEmail[j] with strEmail.charAt(j)
2. local[0] with local.charAt(0)
3. local[localLen-1] with local.charAt(localLen-1)
4. domain[domainLen-1] with domain.charAt(domainLen-1)
because "strEmail[j]" did not work in IE.
Validate an E-Mail Address with PHP... (Javascript version)
// NOTE: use this line of code:
//   strEmail = fixBackSlash(strEmail);
// only if the email address comes from a textbox (form);
function isValidEmail(strEmail) {
  this.strrpos = function(haystack, needle, offset) {
    // http://kevin.vanzonneveld.net
    // + original by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
    // * example 1: strrpos('Kevin van Zonneveld', 'e');
    // * returns 1: 16
    var i = haystack.lastIndexOf(needle, offset); // returns -1
    return i >= 0 ? i : false;
  }
  this.fixBackSlash = function(strEmail) {
    var strEmailTemp = "";
    var isBackSlash = false;
    for(var j=0;j 64) {
      // local part length exceeded
      isValid = false;
    } else if (domainLen < 1 || domainLen > 255) {
      // domain part length exceeded
      isValid = false;
    } else if (local[0] == '.' || local[localLen-1] == '.') {
      // local part starts or ends with '.'
      isValid = false;
    } else if (local.match('\\.\\.')) {
      // local part has two consecutive dots
      isValid = false;
    } else if (!domain.match('^[A-Za-z0-9\\-\\.]+$') || domain[domainLen-1] == '.') {
      // character not valid in domain part
      isValid = false;
    } else if (domain.match('\\.\\.')) {
      // domain part has two consecutive dots
      isValid = false;
    } else if (!localsave.match('^(\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-])+$')) {
      // character not valid in local part unless
      // local part is quoted
      if (!localsave.match('^"(\\\\"|[^"])+"$')) {
        isValid = false;
      }
    }
  }
  return isValid;
}
Many other valid emails still fail
Here are a few examples of valid email addresses that fail using this validator:
localhost
joe@localhost
ipv4
joe@123.456.7.89
ipv6
joe@2001:0db8::1428:57ab
Your Examples
joe@123.456.7.89 is not valid, each byte of an IPv4 address can only range 0-255 decimal (and 456 is outside of this range).
Also, I didn't test your addresses, but the domains have to be registered, otherwise the DNS lookup (checkdnsrr) will fail.
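The octet range point is easy to check with PHP's built-in IP filter. This is only a sketch: the helper name isIpAddress is my own, 192.0.2.1 is a documentation address I picked as an in-range example, and the check covers the bare IP string only, not a whole e-mail address.

```php
<?php
// Hypothetical helper: true if $host is a syntactically valid IPv4 or
// IPv6 address according to PHP's built-in IP filter.
function isIpAddress(string $host): bool
{
    return filter_var($host, FILTER_VALIDATE_IP) !== false;
}

var_dump(isIpAddress('123.456.7.89'));         // 456 is out of range
var_dump(isIpAddress('192.0.2.1'));
var_dump(isIpAddress('2001:0db8::1428:57ab'));
```

The first call reports false and the other two report true, so only the first example above fails on syntax alone.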
Nice!
Excellent, I will use it in my systems.
Validate an E-Mail Address with PHP, the Right Way
Thank you, I used it on my page.
JavaScript conversion...
Hi!
I've found your PHP script very effective, so I tried to convert it to JavaScript to check an address before it's sent to the server and if necessary warn the user.
It was very easy even though I'm not an expert programmer, but I have some problems with the last "else if" statement, because it fails to recognize the following addresses: abc\@def@example.com, Fred\ Bloggs@example.com, Doug\ \"Ace\"\ Lovell@example.com, "Doug \"Ace\" L."@example.com, abc\@example.com (this should fail but it doesn't)
The code I used is:
else if (!local.replace("\\\\","").match(/^(\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-])+$/))
{
// character not valid in local part unless
// local part is quoted
if (!local.match(/^"(\\\\"|[^"])+"$/))
{
isValid = false;
}
}
What am I missing?
Thanks for any help!
Great article, just a slight fix
This is terrific.
There is some sort of typo in the part of the code in Listing 9 where you check the A and MX DNS records, which makes this break as written.
Changing:
if ($isValid && !(checkdnsrr($domain,"MX") ||
↪checkdnsrr($domain,"A")))
To:
if ($isValid && !((checkdnsrr($domain,"MX")) ||
(checkdnsrr($domain,"A"))))
seems to make it work.
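For comparison, here is the same test written as a small standalone function, as a sketch only. The name domainAcceptsMail and the optional $lookup parameter are my own additions so the logic can be tried without a network. My guess, and it is only a guess, is that the ↪ continuation mark got copied along with the listing, since the extra parentheses by themselves should not change the logic.

```php
<?php
// Hypothetical wrapper around the DNS test. $lookup defaults to PHP's
// checkdnsrr(); passing another callable lets you test without a network.
function domainAcceptsMail(string $domain, ?callable $lookup = null): bool
{
    $lookup = $lookup ?? 'checkdnsrr';
    // A domain qualifies if it publishes an MX record or, failing that,
    // an A record.
    return $lookup($domain, 'MX') || $lookup($domain, 'A');
}

// Usage with a stub resolver that only knows one A record.
$stub = function (string $host, string $type): bool {
    return $host === 'example.com' && $type === 'A';
};
var_dump(domainAcceptsMail('example.com', $stub));
var_dump(domainAcceptsMail('nowhere.invalid', $stub));
```

With the stub, the first call gives true and the second gives false. Leave out the second argument to query real DNS.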
your fix works for me too
Thanks for the awesome script!
I ran into the same error with that line, and your fix made it work for me too!