While DotNetNuke 4.4.x already validates email addresses in the user profile with a regular expression, portal admins could not change that expression. In DNN 4.5, the regular expression used to validate email addresses can be altered on the User Settings page (Admin > User Accounts > User Settings).
The default regular expression is:
\b[a-zA-Z0-9._%-+']+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b
While this already catches a lot of errors, some invalid email addresses will still validate, and some valid addresses will not. For example, erik.vanballegoij@.dotnetnuke.com will validate, as will erik.vanballegoij@dotnetnuke..com, although neither is valid (punctuation errors). Also, erik@somemuseum.museum will not validate, even though it is perfectly valid, because the expression only allows a top-level domain of two to four letters.
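To see those three cases for yourself, here is a minimal sketch in Python (not DNN's own .NET code). It assumes the validator requires the whole input to match, which I mimic with re.fullmatch; the function name is my own, and Python's regex engine may differ from .NET in other details.

```python
import re

# Default DNN 4.5 expression, copied from the post.
DEFAULT_EMAIL_REGEX = r"\b[a-zA-Z0-9._%-+']+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b"


def validates_default(address: str) -> bool:
    """Return True when the whole address matches the default expression.

    Assumption: the validator only accepts input that matches in full,
    so fullmatch is used instead of search.
    """
    return re.fullmatch(DEFAULT_EMAIL_REGEX, address) is not None


if __name__ == "__main__":
    for address in (
        "erik.vanballegoij@.dotnetnuke.com",   # dot directly after the @
        "erik.vanballegoij@dotnetnuke..com",   # consecutive dots
        "erik@somemuseum.museum",              # six-letter top-level domain
    ):
        print(address, validates_default(address))
```

The first two addresses slip through because the dot is allowed anywhere inside the domain character class.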
I've found the article by Jan Goyvaerts (http://www.regular-expressions.info/email.html) very informative when it comes to validating email addresses with regular expressions. Charles Nurse probably used the same article as inspiration, as the DNN regex closely resembles the first sample on that page. Using a few of the hints in that article led me to construct the following regex:
^[a-zA-Z0-9._%-+']+@(?:[a-zA-Z0-9-]+\.)+(?:[A-Z]{2}|com|org|net|biz|info|name|aero|biz|info|jobs|museum|name)$
^[a-zA-Z0-9._%\-+']+@(?:[a-zA-Z0-9\-]+\.)+(?:[a-zA-Z]{2}|com|org|net|biz|info|name|aero|jobs|museum)$
^[a-zA-Z0-9._%\-+']+@(?:[a-zA-Z0-9\-]+\.)+(?:[a-zA-Z]{2}|aero|arpa|asia|biz|cat|com|coop|edu|gov|info|int|jobs|mil|mobi|museum|name|net|org|pro|root|tel|travel|cym|geo|post)$
(Thanks, Gilles, for letting me know there were a few errors in the regex. That's what you get when copy-pasting...) This will correctly reject addresses with a dot directly after the @ and addresses with consecutive dots. It also accepts the additional top-level domain names.
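As a quick check of the longest expression above, here is a small Python sketch (again, not .NET, and the function name is just for illustration). It runs the same three problem addresses through it, plus one ordinary address.

```python
import re

# The longest expression from the post, split over lines for readability.
EXTENDED_EMAIL_REGEX = (
    r"^[a-zA-Z0-9._%\-+']+@(?:[a-zA-Z0-9\-]+\.)+"
    r"(?:[a-zA-Z]{2}|aero|arpa|asia|biz|cat|com|coop|edu|gov|info|int|jobs"
    r"|mil|mobi|museum|name|net|org|pro|root|tel|travel|cym|geo|post)$"
)


def validates_extended(address: str) -> bool:
    """Return True when the whole address matches the extended expression."""
    return re.fullmatch(EXTENDED_EMAIL_REGEX, address) is not None


if __name__ == "__main__":
    for address in (
        "erik.vanballegoij@.dotnetnuke.com",   # dot directly after the @
        "erik.vanballegoij@dotnetnuke..com",   # consecutive dots
        "erik@somemuseum.museum",              # long top-level domain
        "erik.vanballegoij@dotnetnuke.com",    # ordinary address
    ):
        print(address, validates_extended(address))
```

Each domain label must now contain at least one character before its dot, which is what rules out the two punctuation errors. Note that the named top-level domains are matched in lowercase only unless the expression is run case-insensitively.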
Now, I do not consider myself a regex expert in any way; in fact, I need tools to understand them at all. I like Expresso myself, but I've also looked at RegexBuddy and The Regulator.
Now there's only one question left: what if I wanted to exclude some domains, like yahoo, hotmail, etc.? I've been trying to construct the proper regex but failed miserably, so if anyone out there knows how to add domain exclusion, that would be greatly appreciated!
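One possible approach to domain exclusion is a negative lookahead placed right after the @. The following is only a sketch in Python, not something tested inside DNN. The variable and function names are my own, the blocked names are just the two examples mentioned above, and I am assuming the whole input must match. The lookahead syntax (?!...) is also supported by the .NET regex engine, but the behaviour in the User Settings page should be verified there.

```python
import re

# Building blocks taken from the longest expression in the post.
LOCAL_PART = r"[a-zA-Z0-9._%\-+']+"
LABELS = r"(?:[a-zA-Z0-9\-]+\.)+"
TLDS = (
    r"(?:[a-zA-Z]{2}|aero|arpa|asia|biz|cat|com|coop|edu|gov|info|int|jobs"
    r"|mil|mobi|museum|name|net|org|pro|root|tel|travel|cym|geo|post)"
)

# Negative lookahead (hypothetical addition): fail if any label in the
# domain is exactly "yahoo" or "hotmail", e.g. yahoo.com or mail.hotmail.com.
NOT_BLOCKED = r"(?!(?:[a-zA-Z0-9\-]+\.)*(?:yahoo|hotmail)\.)"

EXCLUDING_EMAIL_REGEX = "^" + LOCAL_PART + "@" + NOT_BLOCKED + LABELS + TLDS + "$"


def validates_with_exclusion(address: str) -> bool:
    """Return True for addresses that match and are not on a blocked domain."""
    return re.fullmatch(EXCLUDING_EMAIL_REGEX, address) is not None


if __name__ == "__main__":
    for address in (
        "erik@yahoo.com",
        "erik@mail.hotmail.com",
        "erik@dotnetnuke.com",
        "erik@notyahoo.com",
    ):
        print(address, validates_with_exclusion(address))
```

The lookahead consumes no characters, so the rest of the expression still validates the domain as before. As written it is case-sensitive, so Yahoo.com would still get through unless the match is made case-insensitive.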
For a full list of generic top level domains, see here: http://www.iana.org/gtld/gtld.htm
[update November 15:] thanks swift51 and dukesb11 for the extra info! [/update]