Text Functions

Data cleanup

Unlike numbers, text (also known as strings) usually comes in non-standard formats where a clean-up is needed. This is due to the broad input possibilities that text fields allow for. Text can contain numbers, blank spaces, special characters etc. It is not uncommon that a certain text input is separated in several fields.

Problems with text fields:

  • Blank spaces: Too many blank spaces, that may be hard to spot.
  • Wrong input: E.g. surname in first name field and first name in surname field.
  • Misunderstandings: E.g. what is considered a middle name or what is the format for country code?
  • Hard to analyze: Hard to extract and analyze information, e.g. large text fields with comments instead of drop-lists with limited choices.

To force users to input cleaner data from the start, text fields should be restricted with Data Validation if possible. This is often the case to some extent in software like ERP systems, but there still a lot of room for input errors. There can also be problematic programming logic like programmers of the ERP only created one input field for a contact name, and you want to work with surnames. Text functions in Excel can handle much of the basic cleanup needed due to input errors and faulty programming logic.

Lesson content