Validating e-mail addresses is hard, and not something which you normally want to do in great detail: while it’s possible to spend a lot of time checking the syntax of an e-mail address, the real measure of whether it’s valid is whether the mail server on that domain accepts it. There is ultimately no way around checking that.
Given that a lot of mail providers implement their own restrictions on the local-part (the bit before the ‘@’) of an e-mail address, an address like !!@gmail.com (which is syntactically valid) probably won’t actually be accepted. So what’s the value in doing syntax checks on e-mail addresses? The value is in catching trivial user mistakes, like pasting the wrong data into an e-mail address field, or making a trivial typo in one.
So, for most use cases, there’s no need to bother with fancy validation: just check that the e-mail address matches the regular expression . That should catch simple mistakes, accept all valid e-mail addresses, and reject some invalid addresses.
Why have I been doing further? Walbottle needs it — I think where one RFC references another is one of the few times it’s necessary to fully implement e-mail validation. In this case, Walbottle needs to be able to validate e-mail addresses provided in JSON files, for its email defined format.
So, I’ve just finished writing a small copylib to validate e-mail addresses according to all the RFCs I could get my hands on; mostly RFC 5322, but there is a sprinking of 5234, 5321, 3629 and 6532 in there too. It’s called libemailvalidation (because naming is hard; typing is easier). Since it’s only about 1000 lines of code, there seems to be little point in building a shared library for it and distributing that; so add it as a git submodule to your code, and use validate.c and validate.h directly. It provides a single function:
size_t error_position; is_valid = emv_validate_email_address (address_to_check, length_of_address_to_check, EMV_VALIDATE_FLAGS_NONE, &error_position); if (!is_valid) fprintf (stderr, "Invalid e-mail address; error at byte %zu\n", error_position);
Fun fact for the day: due to the obs-qp rule, a valid e-mail address can contain a nul byte. So unless you ignore deprecated syntax for e-mail addresses (not an option for programs which need to be interoperable), e-mail addresses cannot be passed around as nul-terminated strings.