Validator.js: IsSlug

Created on 26 Dec 2018  路  11Comments  路  Source: validatorjs/validator.js

It would be nice to have an isSlug validator.

good first issue hacktoberfest help wanted 馃檵鈥嶁檪锔廸eature-request

Most helpful comment

Sorry for the delay! I need to open GitHub more often.

I too have ditched underscores in slugs but as this is a validation function and not linting we should not be too opinionated. In theory the minus sign and underscore also have different meanings.

But thank you for your input and especially the link to the video! The Wikipedia use-case really opened my eyes to a fact I completely overlooked: URIs - including domain names - have non-ASCII characters in them.

Well... I think the best way around all these issues is having an options object with some opinionated defaults :-)

Here's my current version. It's missing the non-ASCII characters but I guess we can add that as another option later on and make it backwards compatible by setting the default to ASCII-only.

const default_slug_options = {
    allow_underscore: false,
    allow_uppercase: false,
    allow_parenthesis: false,
};

function isSlug(str, options) {
    assertString(str);
    options = merge(options, default_slug_options);

    let charset = 'a-z0-9\\-';

    if (options.allow_underscore) {
        charset += '_';
    }

    if (options.allow_uppercase) {
        charset += 'A-Z';
    }

    // Curly brackets are considered unsafe characters:
    // https://tools.ietf.org/html/rfc1738#section-2.2
    // Angle brackets are excluded (because used as delimiters):
    // https://tools.ietf.org/html/rfc2396#section-2.4.3
    // Square brakets are reserved characters:
    // https://tools.ietf.org/html/rfc3986#section-2.2
    if (options.allow_parenthesis) {
        charset += '\\(\\);'
    }

    let regex_charset = new RegExp(`^[${charset}]+$`);
    let regex_boundaries_consecutive = /[-_]{2,}/;
    let regex_boundaries_leading = /^[-_]/;
    let regex_boundaries_trailing = /[-_]$/;

    return regex_charset.test(str) &&
           !regex_boundaries_consecutive.test(str) &&
           !regex_boundaries_leading.test(str) &&
           !regex_boundaries_trailing.test(str);
}

All 11 comments

I am new to coding and open source. How can I try this?

Hi, I'm starting to work on this, I'll send a PR in the near future.

Maybe we should define what the requirements for a slug are.

For me a slug should be limited to:

  • Strings with at least 1 character
  • ASCII letters from A to Z (lower-case)
  • Numbers 0 through 9
  • Minus signs (-) and underscores (_)

Thought I would share this gist as it seems to cover all of the cases:
https://gist.github.com/hagemann/382adfc57adbd5af078dc93feef01fe1

Good point! I missed the consecutive minus signs. Amending my list:

  • Should not have consecutive minus signs or underscores

As there are no more inputs to what constitutes a valid slug and no pull requests have been issued I decided to convert my list to code to keep things rolling.

/**
 * Checks if a given string is a valid slug for use in URIs.
 * 
 * A slug must ...
 * 
 * ... have at least one character.
 * 
 * ... be limited to:
 *  a) lower-case, ASCII letters from A to Z
 *  b) number 0 through 9
 *  c) minus signs and underscores
 *
 * ... not contain consecutive minus signs and/or underscores.
 * 
 * @param {string} str The string to validate as a slug.
 * @returns {boolean} Whether or not the string is a valid slug.
 */
function isSlug(str) {
    assertString(str);
    return str.length > 0 &&
           /^[\x61-\x7A0-9-_]*$/.test(str) &&
           !/[-_]{2,}/.test(str);
}

But we should really have a discussion about the requirements of a slug.

For example: Should a slug be allowed to start and/or end with a minus sign or underscore?

Slugs should never start or end with dashes or underscores. Dashes vs underscores

The origins of the term slug actually came from the newspaper industry as short name to use for an article. Marketing agencies will recommend using dashes for SEO.

Some might argue to that underscore are a valid to use in slugs. I found that when typing URLs, underscores require 2 keys to be pressed rather than one. It makes typing a URL easier when using dashes, should someone feel the need to type out a URL :)

Also some slugs implementations might also strip out words that don't actually describe the string such as the following string:

How to create (and complete) an SEO audit report for your clients
create-seo-audit-report-clients

Hope this helps.

Sorry for the delay! I need to open GitHub more often.

I too have ditched underscores in slugs but as this is a validation function and not linting we should not be too opinionated. In theory the minus sign and underscore also have different meanings.

But thank you for your input and especially the link to the video! The Wikipedia use-case really opened my eyes to a fact I completely overlooked: URIs - including domain names - have non-ASCII characters in them.

Well... I think the best way around all these issues is having an options object with some opinionated defaults :-)

Here's my current version. It's missing the non-ASCII characters but I guess we can add that as another option later on and make it backwards compatible by setting the default to ASCII-only.

const default_slug_options = {
    allow_underscore: false,
    allow_uppercase: false,
    allow_parenthesis: false,
};

function isSlug(str, options) {
    assertString(str);
    options = merge(options, default_slug_options);

    let charset = 'a-z0-9\\-';

    if (options.allow_underscore) {
        charset += '_';
    }

    if (options.allow_uppercase) {
        charset += 'A-Z';
    }

    // Curly brackets are considered unsafe characters:
    // https://tools.ietf.org/html/rfc1738#section-2.2
    // Angle brackets are excluded (because used as delimiters):
    // https://tools.ietf.org/html/rfc2396#section-2.4.3
    // Square brakets are reserved characters:
    // https://tools.ietf.org/html/rfc3986#section-2.2
    if (options.allow_parenthesis) {
        charset += '\\(\\);'
    }

    let regex_charset = new RegExp(`^[${charset}]+$`);
    let regex_boundaries_consecutive = /[-_]{2,}/;
    let regex_boundaries_leading = /^[-_]/;
    let regex_boundaries_trailing = /[-_]$/;

    return regex_charset.test(str) &&
           !regex_boundaries_consecutive.test(str) &&
           !regex_boundaries_leading.test(str) &&
           !regex_boundaries_trailing.test(str);
}

@Heydad-Helfer I can see that you worked on this issue on your repository but have not raised a Pull request.
Could you help us get a PR up for review on the same? We will really appreciate.
Looking forward to your reply.
Cc. @profnandaa

OMG. I completely forgot about this. I started a new job after posting this and got completely sidetracked. If someone wants to use my snipped for a PR, feel free.

Sorry for the abandonment!

@fkm I will check on this someday later or someone else can check on it.
Thanks for the response.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

spyshower picture spyshower  路  3Comments

AtomicBorg picture AtomicBorg  路  3Comments

mfbx9da4 picture mfbx9da4  路  4Comments

mren picture mren  路  3Comments

woverton picture woverton  路  4Comments