layout |
---|
reference |
including tab separated (tsv), comma separated (csv), Excel (xls, xlsx), JSON, XML, RDF as XML, Google Spreadsheets
{:auto_ids}
csv : A file extension indicating that a text file has values separated by commas (comma-separated values).
Clustering : A method for finding groups of different values that may actually represent the same thing.
Faceting : A method for exploring the values in a variable. In this episode it is used to explore the values in order to identify errors in data entry.
Filter : To select a subset of data from a dataframe.
JSON : A file extension indicating that the values in a text file are structured using JavaScript Object Notation (JSON).
RDF : A file extension indicating that the values in a file are structured using Resource Description Framework (RDF).
Regular expressions (regex) : Text strings that describe a search pattern. They usually incorporate wildcards to match letters, numbers, punctuation, spacing, or some combination of these.
tsv : A file extension indicating that a text file has values separated by tabs (tab-separated values).
xls : A file extension indicating that a file is a spreadsheet created by Microsoft Excel.
xlsx : A file extension indicating that a file is a spreadsheet created by Microsoft Excel using XML.
XML : A file extension indicating that the values in a file are structured using Extensible Markup Language (XML).
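
To make the faceting and clustering entries above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the tool's actual implementation: the function names `facet`, `fingerprint`, and `cluster` and the sample values are hypothetical, and the clustering shown is a simplified key-collision approach in which values that reduce to the same normalised key are grouped together.

```python
import re
import string
from collections import Counter, defaultdict


def facet(values):
    """Count how often each distinct value occurs (a text-facet style summary)."""
    return Counter(values)


def fingerprint(value):
    """Build a normalised key: trim, lowercase, drop punctuation, sort unique tokens."""
    cleaned = value.strip().lower()
    # A regular expression removes every ASCII punctuation character.
    cleaned = re.sub(f"[{re.escape(string.punctuation)}]", "", cleaned)
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group distinct values sharing a key; keep only groups with several variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Hypothetical sample column with inconsistent data entry.
sample = ["New York", "new york ", "York, New", "Boston", "Boston"]

counts = facet(sample)      # how many times each exact value appears
clusters = cluster(sample)  # variants that probably mean the same thing

print(counts)
print(clusters)
```

In this sketch the facet treats "New York" and "new york " as separate values, which is how data entry errors become visible, while the clustering step groups them under one key so they can be reviewed and merged.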