Runtime: TryParse for email addresses

Created on 16 Nov 2017  路  47Comments  路  Source: dotnet/runtime

Currently we have to write this code:

```c#
public static bool IsEmailAddress(this string value)
{
try
{
new System.Net.Mail.MailAddress(value);
return true;
}
catch ()
{
return false;
}
}

MailAddress should have a TryParse method like Int32.TryParse.

## Rationale and Usage

Throwing and catching exceptions can have a significant impact on performance. So when possible there should be a non-throwing alternative for methods that commonly fail. An example would be `Int32.Parse` and `Int32.TryParse`.

There is also an impact on debugger. If you have the IDE set to "Break on All Exceptions" in order to see where an exception is being thrown, it will also catch this even though it is not interesting.

## Proposed API

```csharp
class MailAddress {
    public static MailAddress Parse(string address);
    public static MailAddress Parse(string address, string displayName);
    public static MailAddress Parse(string address, string displayName, Encoding displayNameEncoding);

    public static bool TryParse(string address, out MailAddress result);
    public static bool TryParse(string address, string displayName, out MailAddress result);
    public static bool TryParse(string address, string displayName, Encoding displayNameEncoding, out MailAddress result);
}

Details

Open Questions

api-approved area-System.Net

Most helpful comment

To put it in simpler way MailAddress.TryParse methods ar not for checking if email is valid, but for checking if given string is in a format, which represents email address.

All 47 comments

Can you post full API proposal? (see API review process)

How's that look?

See the link I posted - it has details and also example. The key thing is that API should be thought through various points of views (usage patterns, similar patterns in BCL, etc.)

Ah, I didn't realize you updated top post.

My apologies, I see how my last comment was a bit ambiguous.

@Grauenwolf Have you tried writing an extension class to accomplish this task?

Alternatively, why not make the MailAddressParser which the MailAddress uses under the covers not internal?

https://github.com/dotnet/corefx/blob/f0ccd9742065ee2aa296448fe3dd38ef6ff77af9/src/Common/src/System/Net/Mail/MailAddressParser.cs#L21

As long as it has a TryParse method, I have no preference.

@euclid47
Own extension is not enough, because it will still have the issues @Grauenwolf mentioned in his rationale: impact on performance because of thrown exception and troubles in IDE when you use _Break on All Exceptions_. The other disadvantage of extension is, that you have to have an instance already and this should be static methods as shown in proposed API.

@ctolkien
This probably will not work so easily. My opinion is, that the implementation of TryParse should not throw any exceptions even internally because od performance (of course arguments checking and thus some Argument[Null]Exception is OK).

I agree with @Grauenwolf that TryParse would be a very nice method to have. We also use the try-catch block just to check if string is valid email.

The only way to discover if an email is valid is to send an email to it and see what happens. Just because it looks like an email address does not mean it's valid. When you TryParse on a number the result indicates that the value parsed is, or is not, a number. TryParse for email addresses cannot do this.

As people make security decisions on email addresses I would not like to see this added.

(As an aside I wish System.Net.Mail would just go away).

We also use the try-catch block just to check if string is valid email.

My experience is that the only real way to check if an email is valid is to send an email to it.

Edit: gah beaten.

That's like arguing we shouldn't parse someone's date of birth because the only way to know if the DOB is valid is to check the birth certificate.

If using for UI validation, you shouldn't wait for a bounce to inform the user they typed "me#google.com".

Personally I'm not even using it for sending emails. I'm using it to extract the user and host name as separate fields for LDAP lookups. (Which also proves the email address is valid without sending an email.)

On another project I need to read the top 10,000 rows from each table and see which non-null columns always, never, or sometimes contain email addresses.

Again, I need an email address parser. But I'm not going to be actually sending out emails.


Finally, how would you use an email address alone for security? Even if you did send a validation email, it could be to a fake address set up for that purpose.

You were given an email parser in your Stack Overflow question, so what more do you need really? I'm in agreement with @blowdart here, not keen on adding extra APIs to parse email addresses, especially as the rules for what a valid address has the potential to change, meaning the API would go out of date.

I fail to see the logic in your argument. Class MailAddress isn't going away. In the unlikely event that the definition of an email address changes, it will have to be updated anyways.

And no, a Regex is not a substitute for a standards compliant parser.

@Grauenwolf For both questions I would ask these on stackoverflow.com due to being off topic. Both questions are important and I have a few ideas to accomplish both tasks.

@blowdart, @ctolkien, @WiredUK

When you TryParse on a number the result indicates that the value parsed is, or is not, a number. TryParse for email addresses cannot do this.

It can and it definitelly would. You mix two different things here. One is a check if some string _represents_ an email address, the other is if that email address _really exists_ (or is valid as you say). 藱TryParse藱 is about the first one. The same is with int.TryParse - it just checks if string is a nuber, but does not check if that number is valid in given context (day of birth example is a good one).

The other thing is - if you even send an email to that address, how can you be sure? What if the email server is down? Do you assume in this case that email is invalid?

especially as the rules for what a valid address has the potential to change, meaning the API would go out of date.

It is quite unlikely that the format of email address will change, but if it happens, this API will not change. It still will have the same input and output. What would really change is that regex, because it will need to adapt to new format.

To put it in simpler way MailAddress.TryParse methods ar not for checking if email is valid, but for checking if given string is in a format, which represents email address.

@euclid47

For both questions I would ask these on stackoverflow.com due to being off topic. Both questions are important and I have a few ideas to accomplish both tasks.

This is not about searching for some workaround, but it is an API proposal for .NET core. So this is the place where to ask/propose. If you have some ideas, why not share them?

@satano The two questions about searching through large datasets and using an email address as the only method of security. Both are valid questions but off topic for the original feature request. Those are best suited for stackoverflow.com.

@euclid47 OK. Thanks.

I have several projects where this would be useful (today I use the try-catch hack).

In my mind this would be very similar to the Uri.TryCreate method which already exists. The idea is to validate the structure, which is then exposed via MailAddress properties like Host.

Like with Uri, this of course cannot validate whether the address points to an actual email account.

I'd be okay with this so long as it allows just about any combination of special characters, as valid email addresses do. I'd be hard-pressed to find a reason to go further than checking something like \S\s*@\s*\S (at least one non-whitespace character on each side of the '@'). Even better (thanks @tannergooding):

bool IsValidEmailAddress(string candidate)
{
    if (candidate == null) return false;
    var trimmed = candidate.Trim();
    return trimmed.Length >= 3 && trimmed.IndexOf('@', 1, trimmed.Length - 2) != -1;
}

@jnm2 presumably this would run the exact same validation as a what the MailAddress constructor already does.

```c#
using System.ComponentModel.DataAnnotations;

public bool IsValidEmail(string source)
{
return new EmailAddressAttribute().IsValid(source);
}
```

Can't someone just port this over?

@truencoa interesting. Looks like EmailAddressAttribute uses a regex under the hood, which seems less robust than MailAddress's custom parser (perhaps not, though).

well, we tried using URI because that's just simpler, but couldn't figure out a way to parse it with and have our fail tests pass:

c# string emailAddress = "manythings@support@microsoft. com"; Uri uri; if (Uri.TryCreate(Uri.UriSchemeMailto + Uri.SchemeDelimiter + emailAddress, UriKind.Absolute, out uri)) { MailAddress mailAddress = new MailAddress(emailAddress); }
Uris are simply too accepting - especially with mailtos.

Regardless of implementation, I agree this would be a nice addition to the API.

First, there is the performance impact. Though probably minimal on today's machines, in a world of micro/nano-services, focus on milliseconds, and the .NET team's attention to performance in general, I feel this is an argument pro.

But there is also the fact that it would be easily discoverable. int.TryParse is well-known, even among beginning .NET programmers. Now, many developers would search online and probably find some regex that doesn't really match everything that is acceptable according to the standard (RFC 2822).

This makes it a good addition to the API in my opinion.

Open question: Should we add also Parse methods when we have constructors?

My opinion: that might be good because it follows the structure of int and other types, in that a Parse and TryParse method go hand in hand. Internally, you could of course point them to the constructor.
But that's just my opinion and I have little experience with framework and/or API design. I'm curious to read what others think.

Consistency is important.

And it's easier to use static functions as delegates than constructors.

My current use case is similar to @Grauenwolf. I am validating a textbox value as its entered. So for each character entered before the email is valid, I get an exception. If I am debugging with break on exception, then it becomes a bit problematic to test.

That's a good use case, thanks!

My opinion: that might be good because it follows the structure of int and other types, in that a Parse and TryParse method go hand in hand. Internally, you could of course point them to the constructor.
But that's just my opinion and I have little experience with framework and/or API design. I'm curious to read what others think.

I agree, this is a de-facto pattern established throughout the library and I'd prefer to preserve that. See Guid for an example of having both a parsing-ctor, a Parse, and a TryParse.

Video

  • If we were to expose the TryParse ones, we should also define corresponding static Parse methods.
  • The bigger question is whether we'd recommend MailAddress at all. The data annotations implementation for IsValid is much simpler, probably because email addresses are complex and in practice most validators reject rare, but valid email addreses.

triage: we're okay with adding both Parse and TryParse, and think there's enough value (and low effort) for this.

New API proposal:

```c#
public static MailAddress Parse(string address);
public static MailAddress Parse(string address, string displayName);
public static MailAddress Parse(string address, string displayName, Encoding displayNameEncoding);

public static bool TryParse(string address, out MailAddress result);
public static bool TryParse(string address, string displayName, out MailAddress result);
public static bool TryParse(string address, string displayName, Encoding displayNameEncoding, out MailAddress result);

```

The bigger question is whether we'd recommend MailAddress at all

One reason we've found MailAddress helpful is that it allows you to strongly type variables and fields that store email addresses (as opposed to just storing these as string). Not only does this enforce that invalid values are caught early on in the process, but also it makes it easier for callers to understand what type of value is required. It also gives convenient access to components of the address (like Host).

I see this as comparable to using Uri over string or TimeSpan over int/long.

That last question made no sense to me. When I'm checking rows in a database, I'm not going to use a DataAnnotation to detect email addresses.

And if the DA validation is rejecting legal email addresses, then it should be fixed.

  • It seems odd to use Parse/TryParse when there are more components that just a single string that is being parsed.
  • We should follow the Uri.TryCreate() pattern:

    • Remove Parse(), that's what the constructor does

    • Rename TryParse() to TryCreate()

```C#

nullable disable

public static bool TryCreate(string address, out MailAddress result);
public static bool TryCreate(string address, string displayName, out MailAddress result);
public static bool TryCreate(string address, string displayName, Encoding displayNameEncoding, out MailAddress result);
```

Uri.TryCreate doesn't follow the convention and shouldn't be mimicked. As far as I can tell, it's the only class that uses TryCreate instead of Parse/TryParse.

We should follow the Uri.TryCreate() pattern:

@terrajobst @scalablecory I don't think Uri is the best pattern to follow here. Most of the other types in .NET follow the Parse/TryParse pattern.

Uri.TryCreate doesn't follow the convention and shouldn't be mimicked. As far as I can tell, it's the only class that uses TryCreate instead of Parse/TryParse.

The feedback API review had for this was that this does not follow TryParse convention:

c# public static bool TryParse(string address, string displayName, Encoding displayNameEncoding, out MailAddress result);

TryParse typically only has a single string + optional style + optional formatter as input, i.e. every argument is intrinsic to parsing. What we have here with MailAddress.TryParse, though, is half-way between a parser and a constructor: some of the params are only used for initialization, not parsing.

The suggestion to use TryCreate came from Uri.TryCreate being prior art for a factory half-way between parser and constructor.

I hope this gives some context. @Grauenwolf @davidsh if you have strong opinion on design and can present a good counter-argument, I will take this back to API review.

Where are all of these extra parameters coming from? We're just asking for the parsing of a single string.

bool TryParse(string s, out MailAddress result)

While I assume it's possible, I can't think of any situation where I've had a separate display name and email address but didn't already know the email address was valid.

What we're looking for is to check a single column of values from a database or file feed. Usually as part of some sort of import job or data analysis.

Also note we do have multi-parameter TryParse methods. For example, DateTime.TryParseExact.

https://docs.microsoft.com/en-us/dotnet/api/system.datetime.tryparseexact?view=netframework-4.8

Other multi-parameter TryParse include Int32.TryParse and Timespan.TryParse

So there is no basis for the claim that TryParse strictly has a single input parameter.

Where are all of these extra parameters coming from? We're just asking for the parsing of a single string.

The proposed API currently mirrors all the ctors, which take more than a single string.

there is no basis for the claim that TryParse strictly has a single input parameter.

This is not a claim anyone made.

Current TryParse take only parameters that are specifically involved in parsing. In the example above, displayName and displayNameEncoding are not used for parsing but instead to initialize values.

I'm working on this and I agree with @scalablecory if we want to return an "instance" of MailAddress from parse(that in this case I would name input validation) we should allow all "possible" combinations allowed by ctors, for intance current code allow to override the display name also if it's present inside address part, we would lose this. I believe that api should be simmetric in this case.
Would be different if the semantic is only a "validation" one like bool MailAddress.IsValid(string address) and after we'll compose code validation+one of ctors.
A validation api could be also better from performance(no discarded MailAddress instance) perspective in scenario where "I need to check that 1mil email in my csv are ok".

Triage: we like TryCreate, can adjust later if there is more discussion. @MarcoRossignoli would you like me to assign this issue to you?

Yes @scalablecory feel free to assign to me.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

jamesqo picture jamesqo  路  3Comments

v0l picture v0l  路  3Comments

noahfalk picture noahfalk  路  3Comments

matty-hall picture matty-hall  路  3Comments

jzabroski picture jzabroski  路  3Comments