Usage Guide

Concepts

Agnitio is organized around the concept of a dataset. A dataset encapsulates what has been measured and how it has been described.

The “what” is called metrics (e.g., jobs, earnings).
The “how” is called dimensions (e.g., occupation, industry, area).

Example: US Industry Dataset

Dimensions:

Area – Nation, states, and counties
Industry – Industries as defined by a recent version of the NAICS standard
Class of Worker – Employed versus self-employed

Metrics:

Jobs – Average number of jobs over the year
Earnings – Total earnings for the year
Establishments – Average number of establishments

This dataset allows querying the total (sum) number of jobs in a particular area (or group of areas) for all or selected industries and employment classes.

Time is treated as a special dimension that combines with metrics. For a given combination of areas, industries, and classes, you can request multiple years of data in a single call.
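As a rough illustration of how time combines with metrics, the sketch below builds one metric entry per year, following the <Metric>.<Year> naming that appears in metric names such as Jobs.2023. The helper name yearly_metrics is hypothetical and not part of Agnitio.

```python
def yearly_metrics(metric, first_year, last_year):
    """Build one metric entry per year, e.g. {"name": "Jobs.2021"}.

    Hypothetical helper: it only assembles the "metrics" array of a query body.
    """
    return [{"name": f"{metric}.{year}"} for year in range(first_year, last_year + 1)]


metrics = yearly_metrics("Jobs", 2021, 2023)
print(metrics)
```

The resulting list can be used as the metrics field of a single query, so several years come back in one call.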

1. Dataset Versioning

Datasets are versioned, generally by time. Lightcast produces regular releases for its core datasets, including both historical and projected data.

For example, the 2023.3 and 2023.4 versions of the US Industry dataset cover the same years, but 2023.4 may include updated sources or adjusted estimates.
We recommend that API consumers use the latest version unless they are working on a long-term project that requires consistent data.

The available versions of a dataset, including the latest, can be found by querying the /meta/dataset/<name> endpoint.

2. Hierarchies

Dimensions are organized as hierarchical taxonomies. For example, Lightcast's standard U.S. Area dimension follows the FIPS (Federal Information Processing Standards) system for states and counties:

Nation
State
County

Each county has exactly one parent state. Example:

Latah County, ID → code 16057 → parent State: Idaho → code 16
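The parent relationship in this example follows from the structure of FIPS codes: a five-digit county code begins with the two-digit code of its state. A minimal sketch, assuming well-formed five-digit county codes (the function name parent_state is hypothetical; the hierarchy returned by Agnitio remains the authoritative source for parent and child relationships):

```python
def parent_state(county_fips):
    """Return the 2-digit state FIPS code that prefixes a 5-digit county code."""
    if len(county_fips) != 5 or not county_fips.isdigit():
        raise ValueError(f"expected a 5-digit county FIPS code, got {county_fips!r}")
    return county_fips[:2]


print(parent_state("16057"))  # prints 16 (Idaho)
```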

Similar hierarchies exist for:

Industries
Occupations
Classes of worker
Gender, race, ethnicity, and other demographic facets

The contents of a dimension may vary between dataset versions. Taxonomies are updated regularly to reflect changes in base data sources.

Data Discovery

Agnitio provides metadata endpoints that help you understand what datasets are available, their versions, and how they are structured. These endpoints allow you to explore datasets before making queries. The datasets you see depend on your Lightcast contract.

1. List Available Datasets

Use the /meta endpoint to return all datasets you can access and their available versions.

curl --request GET \
  --url https://agnitio.emsicloud.com/meta \
  --header 'Authorization: bearer <access_token>'

Response:

{
  "datasets": [
    {
      "name": "emsi.us.industry",
      "versions": [
        "2023.4",
        "2024.1",
        "2024.2",
        "2024.3"
      ]
    },
    {
      "name": "emsi.us.unemployment.age",
      "versions": [
        "2023.4",
        "2024.1",
        "2024.2",
        "2024.3"
      ]
    }
  ]
}
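Once the response is parsed, it is often convenient to index it by dataset name. The sketch below does this for a trimmed copy of the sample response above; versions_by_dataset is a hypothetical helper and no request is sent.

```python
import json

# Trimmed copy of the sample /meta response shown above.
sample = """
{
  "datasets": [
    {"name": "emsi.us.industry", "versions": ["2023.4", "2024.1", "2024.2", "2024.3"]},
    {"name": "emsi.us.unemployment.age", "versions": ["2023.4", "2024.1", "2024.2", "2024.3"]}
  ]
}
"""


def versions_by_dataset(payload):
    """Map each dataset name to its list of available versions."""
    return {entry["name"]: entry["versions"] for entry in payload["datasets"]}


catalog = versions_by_dataset(json.loads(sample))
for name, versions in catalog.items():
    print(name, versions)
```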

2. Dataset Definitions

Add /definitions to the path to return dataset metadata, including descriptions and use cases.

curl --request GET \
  --url https://agnitio.emsicloud.com/meta/definitions \
  --header 'Authorization: bearer <access_token>'

Response:

{
  "datasets": [
    {
      "name": "EMSI.us.MinimumWage",
      "versions": [
        "2023.4",
        "2024.1",
        "2024.2",
        "2024.3"
      ],
      "description": "# Description\n\nThis dataset shows minimum wage by state by year, from 2001 to the most recent year available. Data is available for all 50 states plus the District of Columbia, as well as the US minimum wage. In cases where cities within the state have different minimum wage laws than the state itself, the state's minimum wage is shown. In all cases where the state's minimum wage is less than the US minimum wage, the US minimum wage is shown.\n\nOnly identity mappings are allowed along StateID.\n\n# Use Cases\n\n#### Questions answered by this dataset:\n\n* What was the minimum wage in Oregon in 2011?\n* What is the minimum wage in the state of Washington?\n* What state has the highest minimum wage?\n\n# Metrics\n\n* MinimumWage: Shows the minimum wage for the state and year selected; if the state minimum wage is less than the nation, then the nation is shown.\n\n# Filters\n\n* Area (Nation, State)\n* Year\n",
      "title": "US Minimum Wage"
    }
  ]
}

3. Dataset Versions

Use dataset/<name> to return all available versions of a specific dataset.

curl --request GET \
  --url https://agnitio.emsicloud.com/meta/dataset/emsi.us.industry \
  --header 'Authorization: bearer <access_token>'

Response:

[
  "2023.4",
  "2024.1",
  "2024.2",
  "2024.3"
]
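To pick the latest version from such a list, compare the parts numerically rather than as strings. This is a sketch that assumes every version has the form <year>.<release>, as in the samples here; the function name latest_version is hypothetical.

```python
def latest_version(versions):
    """Return the newest version, assuming names look like "<year>.<release>"."""
    def sort_key(version):
        year, release = version.split(".")
        return int(year), int(release)

    return max(versions, key=sort_key)


print(latest_version(["2023.4", "2024.1", "2024.2", "2024.3"]))  # prints 2024.3
```

Numeric comparison avoids the string-ordering pitfall where a release such as "2024.10" would sort before "2024.2".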

4. Dataset Version Metadata

Use dataset/<name>/<version> to return details about a dataset version, including attributes, metrics, and dimensions.

curl --request GET \
  --url https://agnitio.emsicloud.com/meta/dataset/emsi.us.industry/2024.2 \
  --header 'Authorization: bearer <access_token>'

Response:

{
  "datasetName": "EMSI.US.Industry",
  "versionName": "2024.2",
  "attributes": {
    "estabStartYear": "2004",
    "minYearInclusive": "2001",
    "estabYear": "2023",
    "name": "Industry",
    "path": "Industry",
    "type": "dataset",
    "currentJobsQuarters": "2022Q4,2023Q1,2023Q2,2023Q3",
    "maxYearInclusive": "2034",
    "description": "# Description\n\nEmsi's industry dataset contains information on industries back to 2001. Job counts are projected 10 years beyond the current calendar year; earnings are not projected. Available areas are nation, states, MSAs, counties, ZIPs, and tracts. Establishment data is not available for ZIP or tract level.\n\nWhen requesting MSA codes or ZIP codes as the area constraint in a query, the code must be prepended by MSA or ZIP, respectively (e.g. for ZIP code 98102, use 'ZIP98102')\n\n# Use Cases\n\n#### Questions answered by this dataset:\n\n* How many manufacturing jobs are projected to exist over the next 10 years in Calvert County, MD?\n* How many Insurance Carriers establishments are there in Nebraska?\n* What is the average wage for workers in hospitals in Ann Arbor, Michigan?\n* How much higher is the average wage today for workers in hospitals than it was in 2005?\n* How many health care jobs exist in the ZIP codes that make up downtown Seattle?\n* How do earnings for Registered Nurses differ in the various parts of Los Angeles County?\n\n# Metrics\n\n* Jobs: The number of occupied positions. This is not quite the same as workers because one worker might fill more than one position.\n* Wages: Total base earnings for the industry. Note that this figure is for the whole industry, not for the average worker in the industry.\n* Supplements: Total supplements for the industry. This figure is also for the whole industry, not for the average worker in the industry.\n* Establishments: The number of physical business locations.\n* Earnings: Total earnings for the industry. Sum of wages and supplements. This figure is also for the whole industry, not for the average worker in the industry.\n* EPW: Earnings Per Worker. The earnings figure (sum of wages and salaries) divided by the number of jobs in the industry.\n* SPW: Supplements Per Worker. Supplements for the industry divided by the number of jobs in the industry.\n* WPW: Wages Per Worker. Wages for the industry divided by the number of jobs in the industry.\n\n# Filters\n\n* Class of Worker\n* Area (Nation, State, MSA, County, ZIP, tract)\n* Industry (2 to 6-digit NAICS)\n* Year\n",
    "earnYear": "2023",
    "currentYear": "2023",
    "countryCode": "us",
    "datarun": "2024.2",
    "displayName": "US Industry (Nation, State, MSA, County, ZIP, Census Tract)",
    "releaseDate": "2024-04-12 07:25:42.424191Z",
    "numAggPaths_AreaID": "4",
    "levelsStored_AreaID": "[1:[1,2,3,4],2:[2,3,4],3:[1,2,3,4],4:[2,3,4]]"
  },
  "dimensions": [
    {
      "name": "ClassOfWorker",
      "levelsStored": ["2"]
    },
    {
      "name": "Area",
      "levelsStored": ["1", "2", "3", "4"]
    },
    {
      "name": "Industry",
      "levelsStored": ["1", "2", "3", "4", "5", "6"]
    }
  ],
  "metrics": [
    { "name": "Jobs.2001" },
    { "name": "Jobs.2002" },
    { "name": "Jobs.2034" },
    { "name": "Wages.2001" }
  ]
}
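The metrics array lists one entry per metric and year, so a quick way to see which years each metric covers is to group the names. The sketch below works on the abbreviated sample above and assumes names of the form <Metric>.<Year>; names without a year suffix are kept with an empty year list. The helper years_by_metric is hypothetical.

```python
from collections import defaultdict


def years_by_metric(metrics):
    """Group metric names such as "Jobs.2001" into {"Jobs": [2001, ...]}."""
    grouped = defaultdict(list)
    for entry in metrics:
        base, sep, year = entry["name"].rpartition(".")
        if not sep or not year.isdigit():
            grouped[entry["name"]]  # no year suffix: register with no years
            continue
        grouped[base].append(int(year))
    return {name: sorted(years) for name, years in grouped.items()}


sample_metrics = [
    {"name": "Jobs.2001"},
    {"name": "Jobs.2002"},
    {"name": "Jobs.2034"},
    {"name": "Wages.2001"},
]
coverage = years_by_metric(sample_metrics)
print(coverage)
```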

5. Dimension Hierarchies

Use dataset/<name>/<version>/<dimension> to return the full hierarchy of a dataset dimension.

curl --request GET \
  --url https://agnitio.emsicloud.com/meta/dataset/emsi.us.industry/2024.2/Area \
  --header 'Authorization: bearer <access_token>'

Response:

{
  "name": "Area",
  "hierarchy": [
    {
      "name": "United States",
      "abbr": "US",
      "child": "0",
      "display_id_parent": "0",
      "parent": "0",
      "aggregation_path": "1",
      "level_name": "1"
    },
    {
      "name": "United States",
      "abbr": "US",
      "child": "0",
      "display_id_parent": "0",
      "parent": "0",
      "aggregation_path": "3",
      "level_name": "1"
    },
    {
      "name": "Alabama",
      "abbr": "AL",
      "child": "1",
      "display_id_parent": "0",
      "parent": "0",
      "aggregation_path": "1",
      "level_name": "2"
    },
    {
      "name": "Alabama",
      "abbr": "AL",
      "child": "1",
      "display_id_parent": "0",
      "parent": "0",
      "aggregation_path": "3",
      "level_name": "2"
    },
    {
      "name": "Delaware",
      "abbr": "DE",
      "child": "10",
      "display_id_parent": "0",
      "parent": "0",
      "aggregation_path": "1",
      "level_name": "2"
    }
  ]
}
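The same area can appear once per aggregation path, so client code usually narrows the hierarchy to one path and level. A minimal sketch over a trimmed copy of the sample above (only three of the fields are kept; filter_hierarchy is a hypothetical helper, and the values are compared as strings because that is how the sample encodes them):

```python
def filter_hierarchy(entries, aggregation_path, level_name):
    """Keep entries on one aggregation path at one hierarchy level."""
    return [
        entry
        for entry in entries
        if entry["aggregation_path"] == aggregation_path
        and entry["level_name"] == level_name
    ]


hierarchy = [
    {"name": "United States", "child": "0", "aggregation_path": "1", "level_name": "1"},
    {"name": "United States", "child": "0", "aggregation_path": "3", "level_name": "1"},
    {"name": "Alabama", "child": "1", "aggregation_path": "1", "level_name": "2"},
    {"name": "Alabama", "child": "1", "aggregation_path": "3", "level_name": "2"},
    {"name": "Delaware", "child": "10", "aggregation_path": "1", "level_name": "2"},
]

states = filter_hierarchy(hierarchy, aggregation_path="1", level_name="2")
print([entry["name"] for entry in states])  # prints ['Alabama', 'Delaware']
```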

6. Filtering Dimension Results

You can filter dimension results in two ways:

By Aggregation Path and Level:

Use query parameters like aggpath and level to refine results by hierarchy.

curl --request GET \
  --url 'https://agnitio.emsicloud.com/meta/dataset/emsi.us.industry/2024.2/Area?aggpath=3&level=4' \
  --header 'Authorization: bearer <access_token>'

By Identifier or Name:

Search within a dimension using identifiers such as FIPS codes or location names.

curl --request GET \
  --url https://agnitio.emsicloud.com/meta/dataset/emsi.us.industry/2024.2/Area/search/Latah \
  --header 'Authorization: bearer <access_token>'
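The two URL shapes above can be assembled programmatically. The sketch below only builds the URLs and sends nothing; dimension_url and search_url are hypothetical helpers, and percent-encoding a multi-word search term is an assumption about what the search endpoint accepts.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://agnitio.emsicloud.com/meta/dataset"


def dimension_url(dataset, version, dimension, aggpath=None, level=None):
    """Build a dimension URL, adding aggpath/level only when they are given."""
    url = f"{BASE_URL}/{dataset}/{version}/{dimension}"
    params = {
        key: value
        for key, value in (("aggpath", aggpath), ("level", level))
        if value is not None
    }
    return f"{url}?{urlencode(params)}" if params else url


def search_url(dataset, version, dimension, term):
    """Build a search URL; the term is percent-encoded (an assumption)."""
    return f"{BASE_URL}/{dataset}/{version}/{dimension}/search/{quote(term, safe='')}"


print(dimension_url("emsi.us.industry", "2024.2", "Area", aggpath=3, level=4))
print(search_url("emsi.us.industry", "2024.2", "Area", "Latah"))
```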

Data Queries

1. Basics

Agnitio data queries are performed by constructing a JSON object that describes the query and sending it via POST to the dataset endpoint.

Here’s an example querying the US Industry dataset to get the number of jobs and establishments in Latah County, ID for the full-service restaurant industry:

curl --request POST \
  --url https://agnitio.emsicloud.com/emsi.us.industry/2024.2 \
  --header 'Authorization: bearer <access_token>' \
  --header 'Content-Type: application/json' \
  --data '{ 
    "metrics": [ 
      { "name": "Jobs.2023", "as": "2023 Jobs" }, 
      { "name": "Establishments.2023" } 
    ], 
    "constraints": [ 
      { "dimensionName": "Area", "map": { "Latah County, ID": ["16057"] } }, 
      { "dimensionName": "Industry", "map": { "Full Service Restaurants": ["722511"] } } 
    ] 
  }'

Response:

{
  "data": [
    {
      "name": "Area",
      "type": "String",
      "rows": ["Latah County, ID"]
    },
    {
      "name": "Industry",
      "type": "String",
      "rows": ["Full Service Restaurants"]
    },
    {
      "name": "2023 Jobs",
      "type": "Double",
      "rows": [507.7011804944611]
    },
    {
      "name": "Establishments.2023",
      "type": "Double",
      "rows": [30.25]
    }
  ],
  "errors": [],
  "timings": [
    "Build sortable results: 0.15ms",
    "Sort results: 0.16ms",
    "Build final response: 0.02ms",
    "Parsed constraints: 2.74ms",
    "Parsed metrics: 0.09ms",
    "Parse query: 3.03ms",
    "Generate query cartesian product: 0.07ms",
    "Generate parallelization plan: 0.03ms",
    "Execute parallel query at: 2.27ms",
    "Overall query time: 12.35ms"
  ],
  "totalRows": 1
}

Query structure explained:

The JSON object contains two required fields: metrics and constraints.
metrics is an array of objects specifying which measures you want. Each object has a required name field and an optional as field to rename the metric in the response.
constraints is an array of objects specifying how the dimensions should be filtered or aggregated. In the example, the Area dimension is limited to a single FIPS code, "16057", representing "Latah County, ID".
Responses are column-oriented: each entry under data is one column, and all rows arrays have the same length, so the values at the same index across columns together form one result row.
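Because the response is organized by column, client code often pivots it into one record per result row. A minimal sketch using a trimmed copy of the response above (columns_to_rows is a hypothetical helper):

```python
def columns_to_rows(response):
    """Pivot a column-oriented response into a list of row dictionaries."""
    columns = response["data"]
    names = [column["name"] for column in columns]
    value_rows = zip(*(column["rows"] for column in columns))
    return [dict(zip(names, values)) for values in value_rows]


response = {
    "data": [
        {"name": "Area", "type": "String", "rows": ["Latah County, ID"]},
        {"name": "Industry", "type": "String", "rows": ["Full Service Restaurants"]},
        {"name": "2023 Jobs", "type": "Double", "rows": [507.7011804944611]},
        {"name": "Establishments.2023", "type": "Double", "rows": [30.25]},
    ],
    "totalRows": 1,
}

rows = columns_to_rows(response)
print(rows[0]["Area"], rows[0]["2023 Jobs"])
```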

Understanding the `map` field:

The map field defines how user-defined names correspond to dimension codes. For example:

{
  "dimensionName": "Area",
  "map": {
    "Seattle Area (10 mile radius)": ["53033", "53035"],
    "Seattle Area (20 mile radius)": ["53033", "53035", "53061", "53053"]
  }
}

Each key in the map becomes a value in the response.
Multiple codes can be combined to form a single user-defined area.

For datasets supporting ZIP and MSA areas, prepend the type to each code (e.g., ZIP98101). FIPS codes do not require a prefix.
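A small builder can apply the prefix rule consistently when constructing an Area constraint. This is a sketch: area_constraint is a hypothetical helper, and the group labels are illustrative names chosen by the caller.

```python
def area_constraint(groups, prefix=""):
    """Build an Area constraint from {label: [codes]}.

    prefix is prepended to every code: "ZIP" or "MSA" for those area types,
    and an empty string for FIPS codes, which need no prefix.
    """
    return {
        "dimensionName": "Area",
        "map": {
            label: [f"{prefix}{code}" for code in codes]
            for label, codes in groups.items()
        },
    }


zip_constraint = area_constraint({"Example ZIP group": ["98101", "98102"]}, prefix="ZIP")
fips_constraint = area_constraint({"Seattle Area (10 mile radius)": ["53033", "53035"]})
print(zip_constraint)
print(fips_constraint)
```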

Shorthand: `mapLevel`:

Instead of mapping each child individually, you can use mapLevel to expand a parent to all its children automatically:

{
  "dimensionName": "Area",
  "mapLevel": {
    "level": 3,
    "predicate":[16]
  }
}

This expands all children of the specified parent (here, state code 16) to individual entries.
The result is equivalent to manually listing every child FIPS code, as in the following map; using the shorthand saves time on large queries:

{
  "dimensionName": "Area",
  "map": {
    "16003": ["16003"],
    "16005": ["16005"],
    ...
    "16057": ["16057"],
    ...
    "16999": ["16999"]
  }
}
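For comparison, the explicit form can also be generated in client code from a list of codes, for example one taken from the dimension hierarchy. In this sketch identity_map is a hypothetical helper and the three codes are only a partial, illustrative list of the children of state 16.

```python
def identity_map(dimension_name, codes):
    """Build a constraint in which every code maps to itself."""
    return {
        "dimensionName": dimension_name,
        "map": {code: [code] for code in codes},
    }


constraint = identity_map("Area", ["16003", "16005", "16057"])
print(constraint)
```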

2. Sorting, Offsets, and Limits

Agnitio allows you to control the order of results using the sortBy field. You can sort by one or more metrics or dimensions in ascending or descending order.

{
  "metrics": [{"name": "Jobs.2020"}],
  "constraints": [
    {
      "dimensionName": "Area",
      "map": {"Seattle Area (10 mile radius)": ["53033", "53035"]}
    }
  ],
  "sortBy": [
    {"name": "Area", "direction": "ascending"},
    {"name": "Jobs.2020", "direction": "descending"}
  ]
}

By default, all requested data is returned, but you can use offset and limit for pagination:

{
  "metrics": [ ... ],
  "constraints": [ ... ],
  "offset": 100,
  "limit": 50
}

offset skips the first N rows of the results.
limit restricts the number of rows returned.
These fields are useful for large datasets or when implementing paging in applications.
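A paging loop can be sketched by generating one query body per page. The helper paged_queries is hypothetical, and the caller supplies the total row count; taking it from the totalRows field of a first response is an assumption, since the meaning of that field under paging is not documented here.

```python
import copy


def paged_queries(query, total_rows, page_size):
    """Yield copies of query with offset and limit set for each page."""
    for offset in range(0, total_rows, page_size):
        page = copy.deepcopy(query)
        page["offset"] = offset
        page["limit"] = page_size
        yield page


base_query = {"metrics": [{"name": "Jobs.2020"}], "constraints": []}
pages = list(paged_queries(base_query, total_rows=120, page_size=50))
print([(page["offset"], page["limit"]) for page in pages])
# prints [(0, 50), (50, 50), (100, 50)]
```

Each yielded body is an independent copy, so the original query is left unchanged and pages can be sent in any order.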

3. Location Quotient

Location Quotient (LQ) measures geographic concentration of metrics like jobs or establishments. Agnitio can calculate LQ automatically for datasets with an Area dimension:

{
  "metrics": [
    {
      "name": "Jobs.2020",
      "as": "Jobs 2020 LQ",
      "operation": {
        "name": "LocationQuotient",
        "geoparent": "0",
        "along": "Industry"
      }
    }
  ],
  "constraints": [
    {
      "dimensionName": "Area",
      "map": {"Latah County, ID": ["16057"]}
    },
    {
      "dimensionName": "Industry",
      "map": {"Full Service Restaurants": ["722511"]}
    }
  ]
}

Operation fields explained:

operation.name – identifies this as a Location Quotient calculation.
operation.geoparent – the parent area code for comparison. Use 0 for national, or the containing state for state-level LQ.
operation.along – the dimension along which to compute concentration, usually Industry or Occupation.
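For intuition, the textbook definition of Location Quotient is the share of an industry in the region divided by its share in the parent area. The sketch below implements that definition with invented numbers; it is not a statement of how Agnitio computes the value internally, and location_quotient is a hypothetical function.

```python
def location_quotient(region_industry, region_total, parent_industry, parent_total):
    """Textbook LQ: (regional industry share) / (parent-area industry share)."""
    region_share = region_industry / region_total
    parent_share = parent_industry / parent_total
    return region_share / parent_share


# Invented figures: 500 of 20,000 regional jobs versus 3,000,000 of
# 150,000,000 jobs in the parent area.
lq = location_quotient(500, 20_000, 3_000_000, 150_000_000)
print(f"{lq:.2f}")  # prints 1.25
```

A value above 1 indicates that the industry is more concentrated in the region than in the parent area.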

4. Shift Share

Shift Share analyzes changes in a metric over time and attributes them to different factors. Agnitio can calculate Shift Share for datasets with an Area dimension and metrics with a time component:

{
  "metrics": [
    {
      "name":"Jobs.2020",
      "operation":{
        "name":"ShiftShare",
        "geoparent":"0",
        "along":"Industry",
        "base": "Jobs.2010"
      }
    }
  ],
  "constraints": [ ... ]
}

Operation fields explained:

operation.name – identifies this as a Shift Share calculation.
operation.geoparent – the parent area code for comparison. Use 0 for national, or the containing state for state-level analysis.
operation.along – the dimension to perform the comparison along, usually Industry or Occupation.
operation.base – the starting metric for comparison (here, analyzing change from 2010 to 2020).

Response format:

Shift Share metrics return four columns per metric:
- Job Change
- Parent Growth Effect
- Mix Effect
- Competitive Effect
If the metric supplies an as field, its value is used as a prefix for these column names.
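For intuition, the classic shift-share decomposition splits the regional job change into three effects that add up to the total. The sketch below implements that textbook decomposition with invented numbers and reuses the four column names listed above; it is not a statement of the formulas Agnitio applies, and shift_share is a hypothetical function.

```python
def shift_share(region_base, region_now,
                parent_industry_base, parent_industry_now,
                parent_all_base, parent_all_now):
    """Textbook shift-share for one industry in one region."""
    overall_growth = parent_all_now / parent_all_base - 1            # parent area, all industries
    industry_growth = parent_industry_now / parent_industry_base - 1  # parent area, this industry
    regional_growth = region_now / region_base - 1                   # region, this industry
    return {
        "Job Change": region_now - region_base,
        "Parent Growth Effect": region_base * overall_growth,
        "Mix Effect": region_base * (industry_growth - overall_growth),
        "Competitive Effect": region_base * (regional_growth - industry_growth),
    }


# Invented figures for a base year and a later year.
result = shift_share(400, 500, 4_000_000, 4_400_000, 140_000_000, 147_000_000)
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

With these figures the three effects are roughly 20, 20, and 60 jobs, which together account for the change of 100.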
