eta.util.general.standardize

standardize(str, remove_parentheticals=False)[source]

Standardize a string by applying a series of transformations.

Specifically:

  1. Replace – with -, and _ with whitespace.

  2. Remove parenthetical content (i.e., […] or ), if remove_parentheticals is True.

  3. Add whitespace around all punctuation.

  4. Collapse all whitespace to a single space.

  5. Convert to lowercase.