API Reference

This section is auto-generated from the source code docstrings.

Connection

class snowloader.SnowConnection[source]

Bases: object

Manages a session against one ServiceNow instance.

Supports four authentication modes (checked in priority order):

  1. Bearer token - pass a pre-obtained token directly via token. No user credentials needed. Useful when auth is handled outside the library (SSO, external token service).

  2. OAuth 2.0 Client Credentials - pass client_id and client_secret without username/password. Best for server-to-server integrations with no human user involved.

  3. OAuth 2.0 Password Grant - pass all four: client_id, client_secret, username, password. Token is acquired lazily on first request and refreshed on 401.

  4. Basic Auth - pass username and password only. Simplest; fine for development, not recommended for production.
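
The priority order above can be pictured as a small decision function. This is a minimal sketch, not snowloader's actual implementation: resolve_auth_mode and the mode labels it returns are hypothetical names chosen for illustration.

```python
from typing import Optional


def resolve_auth_mode(
    token: Optional[str] = None,
    client_id: Optional[str] = None,
    client_secret: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> str:
    """Hypothetical helper: pick an auth mode in the documented priority order."""
    has_client = bool(client_id and client_secret)
    has_user = bool(username and password)
    if token:
        return "bearer"  # 1. a token wins; other credentials are ignored
    if has_client and not has_user:
        return "oauth_client_credentials"  # 2. client pair only
    if has_client and has_user:
        return "oauth_password_grant"  # 3. all four values supplied
    if has_user:
        return "basic"  # 4. username and password only
    raise ValueError("no usable credentials supplied")
```

The real constructor raises SnowConnectionError for missing credentials; ValueError is used here only to keep the sketch self-contained.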

Can be used as a context manager for clean session shutdown:

with SnowConnection(...) as conn:
    loader = IncidentLoader(connection=conn)
    docs = loader.load()
Parameters:
  • instance_url (str) – Full URL of the ServiceNow instance, e.g. "https://mycompany.service-now.com". Trailing slashes are stripped automatically.

  • username (str | None) – ServiceNow user account for authentication.

  • password (str | None) – Password for the user account.

  • client_id (str | None) – OAuth client ID. Enables OAuth when combined with client_secret.

  • client_secret (str | None) – OAuth client secret.

  • token (str | None) – Pre-obtained Bearer token. When provided, all other credentials are ignored.

  • page_size (int) – Records per API call during pagination (1 to 10,000). Defaults to 100.

  • timeout (int) – HTTP request timeout in seconds. Defaults to 60.

  • max_retries (int) – Retry attempts for transient failures (429, 502, 503, 504). Defaults to 3.

  • retry_backoff (float) – Base delay (seconds) between retries; doubles on each attempt. Defaults to 1.0.

  • request_delay (float) – Minimum seconds between consecutive API requests. Helps avoid rate limiting. Defaults to 0 (no delay).

  • display_value (str) – Controls the sysparm_display_value parameter. "true" (default) returns human-readable labels for reference fields. "false" returns raw values. "all" returns both, as dicts with display_value and value keys.

  • proxy (str | None) – Optional proxy URL, e.g. "http://proxy:8080". Applied to all HTTP(S) requests.

  • verify (bool | str) – SSL verification. True (default) uses system CA bundle. Pass a path string to a CA bundle file for custom certificates. False disables verification (not recommended for production).

Raises:

SnowConnectionError – If credentials are missing or invalid, or if instance_url is malformed.
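
To make the max_retries and retry_backoff descriptions concrete, here is a rough sketch of the delay schedule they imply. It assumes the first retry waits exactly the base delay and that no jitter is added. Neither detail is confirmed by this reference, and backoff_delays and should_retry are hypothetical names.

```python
RETRYABLE_STATUSES = {429, 502, 503, 504}  # transient codes listed for max_retries


def backoff_delays(max_retries: int = 3, retry_backoff: float = 1.0) -> list[float]:
    """Hypothetical helper: delay before each retry, doubling on every attempt."""
    return [retry_backoff * (2 ** attempt) for attempt in range(max_retries)]


def should_retry(status_code: int, attempt: int, max_retries: int = 3) -> bool:
    """Retry only transient statuses, and only while attempts remain."""
    return status_code in RETRYABLE_STATUSES and attempt < max_retries


print(backoff_delays())  # [1.0, 2.0, 4.0] with the documented defaults
```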

Example

>>> conn = SnowConnection(
...     instance_url="https://mycompany.service-now.com",
...     username="api_user",
...     password="api_pass",
... )
>>> for record in conn.get_records("incident", query="active=true"):
...     print(record["number"])
__init__(instance_url, username=None, password=None, client_id=None, client_secret=None, token=None, page_size=100, timeout=60, max_retries=3, retry_backoff=1.0, request_delay=0.0, display_value='true', proxy=None, verify=True)[source]
Parameters:
  • instance_url (str)

  • username (str | None)

  • password (str | None)

  • client_id (str | None)

  • client_secret (str | None)

  • token (str | None)

  • page_size (int)

  • timeout (int)

  • max_retries (int)

  • retry_backoff (float)

  • request_delay (float)

  • display_value (str)

  • proxy (str | None)

  • verify (bool | str)

Return type:

None

__enter__()[source]
Return type:

SnowConnection

__exit__(exc_type, exc_val, exc_tb)[source]
Return type:

None

close()[source]

Close the underlying HTTP session and release resources.

Safe to call multiple times. After closing, further API calls will raise an error from the requests library.

Return type:

None

get_records(table, query=None, fields=None, since=None)[source]

Fetch records from a ServiceNow table with automatic pagination.

Yields one record dict at a time so callers can process large result sets without holding everything in memory. Pagination continues until the API returns fewer records than page_size, which signals we have reached the last page.

Parameters:
  • table (str) – ServiceNow table name, e.g. "incident" or "cmdb_ci_server".

  • query (str | None) – Optional encoded query string, e.g. "active=true^priority=1". An ORDERBYsys_created_on suffix is appended automatically.

  • fields (list[str] | None) – Optional list of field names to include in the response. When omitted, ServiceNow returns all fields on the table.

  • since (datetime | None) – Optional datetime for delta/incremental sync. When set, only records updated after this timestamp are returned.

Yields:

Individual record dicts straight from the ServiceNow response.

Raises:

SnowConnectionError – On any non-2xx response from the API.

Return type:

Generator[dict[str, object], None, None]
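
The stopping rule described above (keep paging until a page comes back shorter than page_size) can be sketched with a generic generator. This is an illustration only: fetch_page is a hypothetical callable standing in for one HTTP request, and the in-memory rows stand in for a ServiceNow table.

```python
from typing import Callable, Iterator

Record = dict[str, object]


def paginate(
    fetch_page: Callable[[int, int], list[Record]],
    page_size: int = 100,
) -> Iterator[Record]:
    """Yield records page by page; a short page signals the last one."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        if len(page) < page_size:
            break
        offset += page_size


# Usage with an in-memory stand-in for the API.
rows = [{"number": f"INC{i:07d}"} for i in range(250)]
requested_offsets = []


def fake_fetch(offset: int, limit: int) -> list[Record]:
    requested_offsets.append(offset)
    return rows[offset:offset + limit]


records = list(paginate(fake_fetch))
```

One consequence of this rule: when the total is an exact multiple of page_size, one extra request is made and returns an empty page.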

get_count(table, query=None, since=None)[source]

Return the total record count for a table query.

Hits /api/now/stats/<table>, which is much cheaper than a paginated read. concurrent_get_records() requires this count so it can plan page offsets.

Parameters:
  • table (str) – ServiceNow table name.

  • query (str | None) – Optional encoded query string.

  • since (datetime | None) – Optional delta sync cutoff.

Return type:

int

Returns:

Integer record count, or 0 if the response shape is unexpected.

Raises:

SnowConnectionError – On any non-2xx response from the API.
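
The "0 if the response shape is unexpected" behavior can be sketched as a defensive parse. The payload shape used here, {"result": {"stats": {"count": "42"}}}, is an assumption about the stats endpoint's response, and parse_count is a hypothetical helper rather than part of snowloader.

```python
from typing import Any


def parse_count(payload: Any) -> int:
    """Hypothetical helper: read result.stats.count, falling back to 0."""
    try:
        return int(payload["result"]["stats"]["count"])
    except (KeyError, TypeError, ValueError):
        # Missing keys, a non-dict level, or a non-numeric count all map to 0.
        return 0


print(parse_count({"result": {"stats": {"count": "42"}}}))  # 42
print(parse_count({"result": []}))  # 0
```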

concurrent_get_records(table, query=None, fields=None, since=None, max_workers=16)[source]

Fetch records using a thread pool so pages download in parallel.

Sequential get_records() walks pages one at a time. For large tables (hundreds of thousands of records) that gets slow. This method pre-fetches the total count, splits into pages, and dispatches the page fetches to a thread pool. Each worker thread holds its own requests.Session so connection pools and TLS state stay isolated, which avoids the connection-reuse failure modes some ServiceNow front ends exhibit when many concurrent requests share a single client session.

Records are yielded in the order pages complete, NOT in ORDERBYsys_created_on order. If you need ordered output, sort the consumed list yourself by the relevant timestamp.

Parameters:
  • table (str) – ServiceNow table name.

  • query (str | None) – Optional encoded query.

  • fields (list[str] | None) – Optional list of field names to request.

  • since (datetime | None) – Optional delta sync cutoff.

  • max_workers (int) – Number of worker threads (default 16). Each worker holds its own requests.Session.

Yields:

One record dict at a time as pages arrive.

Raises:

SnowConnectionError – On count failure or any unrecoverable page error.

Return type:

Generator[dict[str, object], None, None]
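
The count-then-fan-out approach can be sketched with the standard library's thread pool. This is a simplified illustration under stated assumptions: fetch_page is a hypothetical callable, the per-worker requests.Session handling described above is omitted, and errors simply propagate from future.result().

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterator

Record = dict[str, object]


def concurrent_paginate(
    fetch_page: Callable[[int, int], list[Record]],
    total: int,
    page_size: int = 100,
    max_workers: int = 16,
) -> Iterator[Record]:
    """Fetch all pages in parallel and yield records as pages complete."""
    offsets = range(0, total, page_size)  # one offset per planned page
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fetch_page, off, page_size) for off in offsets]
        for future in as_completed(futures):
            yield from future.result()


# Usage with an in-memory stand-in for the API.
rows = [{"n": i} for i in range(250)]
got = list(concurrent_paginate(lambda off, lim: rows[off:off + lim], len(rows)))
```

Because as_completed() yields futures in completion order, the records in got are complete but not necessarily in their original order.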

get_attachment(sys_id)[source]

Download the binary content of one sys_attachment record.

Hits the /api/now/attachment/<sys_id>/file endpoint and returns the raw bytes. Honors the connection’s auth, retries, and timeout settings.

Parameters:

sys_id (str) – The sys_id of the attachment record.

Return type:

bytes

Returns:

Raw bytes of the attachment file.

Raises:

SnowConnectionError – On any non-2xx response or network failure.

get_record(table, sys_id)[source]

Fetch a single record by its sys_id.

Parameters:
  • table (str) – ServiceNow table name.

  • sys_id (str) – The unique sys_id of the record to fetch.

Return type:

dict[str, object]

Returns:

The record dict from the API response.

Raises:

SnowConnectionError – If the record does not exist or the API returns an error status.

class snowloader.SnowConnectionError[source]

Bases: Exception

Raised when something goes wrong talking to the ServiceNow API.

status_code

HTTP status code if the error came from an API response. None for network-level failures (timeout, DNS, connection refused).

detail

Human-readable error detail extracted from the response body or the underlying exception message.

__init__(message, status_code=None, detail='')[source]
Parameters:
  • message (str)

  • status_code (int | None)

  • detail (str)

Return type:

None
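
A caller can use status_code to tell HTTP errors apart from network-level failures. The sketch below uses a stand-in class, DemoConnectionError, that mirrors the documented constructor so the example runs without the library; describe is likewise a hypothetical helper.

```python
from typing import Optional


class DemoConnectionError(Exception):
    """Stand-in mirroring the documented constructor; not the library class."""

    def __init__(self, message: str, status_code: Optional[int] = None, detail: str = "") -> None:
        super().__init__(message)
        self.status_code = status_code
        self.detail = detail


def describe(err: DemoConnectionError) -> str:
    """Summarize an error, treating status_code=None as a network failure."""
    if err.status_code is None:
        return f"network failure: {err.detail or err}"
    return f"HTTP {err.status_code}: {err.detail or err}"


try:
    raise DemoConnectionError("request rejected", status_code=403, detail="ACL denied")
except DemoConnectionError as exc:
    summary = describe(exc)
```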

Models

class snowloader.SnowDocument[source]

Bases: object

A single document extracted from a ServiceNow table.

This is the intermediate format that lives between the raw API response and whatever the framework adapters produce. Every loader yields these, and every adapter consumes them.

page_content

The main text content of the document. How this is assembled depends on the specific loader (could be a short description, a KB article body, a concatenation of fields, etc).

metadata

Key-value pairs describing where this document came from. Typically includes sys_id, number, table name, and any other fields the loader considers useful for retrieval or filtering.

page_content: str
metadata: dict[str, Any]
__init__(page_content, metadata=<factory>)
Parameters:
  • page_content (str)

  • metadata (dict[str, Any])

Return type:

None
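
The metadata=<factory> default in the signature suggests a dataclass field with a default factory, so each document gets its own metadata dict. The sketch below shows that pattern with a stand-in class, DemoDocument. It is an assumption about how SnowDocument is built, not a copy of it.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DemoDocument:
    """Stand-in with the same two fields as the documented class."""

    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


a = DemoDocument("first")
b = DemoDocument("second", {"sys_id": "abc123"})
a.metadata["table"] = "incident"  # does not leak into other instances
```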

class snowloader.BaseSnowLoader[source]

Bases: object

Shared foundation for all ServiceNow table loaders.

Subclasses must define:

  • table: The ServiceNow table name to query (e.g. "incident").

  • content_fields: List of field names whose values get concatenated into the document’s page_content.

The base class takes care of pagination, document assembly, delta sync, and journal fetching. Most loaders will not need to override anything beyond the class attributes, but _record_to_document() is available as a hook for loaders that need fancier content formatting.
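
As a rough picture of how content_fields could turn a record into page_content, here is a minimal sketch. The separator (a blank line) and the skipping of empty fields are assumptions, and build_page_content is a hypothetical helper, not the library's _record_to_document().

```python
def build_page_content(record: dict, content_fields: list[str]) -> str:
    """Hypothetical helper: join the non-empty content fields in order."""
    parts = []
    for name in content_fields:
        value = record.get(name)
        if value:  # skip missing, None, and empty-string fields
            parts.append(str(value).strip())
    return "\n\n".join(parts)


record = {
    "short_description": "VPN drops every 10 minutes",
    "description": "",
    "number": "INC0012345",
}
text = build_page_content(record, ["short_description", "description"])
```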

Parameters:
  • connection (SnowConnection) – An initialized SnowConnection instance.

  • query (str | None) – Optional encoded query string for filtering records.

  • fields (list[str] | None) – Optional list of specific fields to request from the API. When left as None, the API returns all fields on the table.

  • include_journals (bool) – Whether to fetch and append work notes and comments from sys_journal_field for each record.

Example

class IncidentLoader(BaseSnowLoader):
    table = "incident"
    content_fields = ["short_description", "description"]

conn = SnowConnection(...)
loader = IncidentLoader(connection=conn, query="active=true")
for doc in loader.lazy_load():
    print(doc.page_content)

table: str = ''
content_fields: list[str] = []
__init__(connection, query=None, fields=None, include_journals=False)[source]
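Parameters:
  • connection (SnowConnection)

  • query (str | None)

  • fields (list[str] | None)

  • include_journals (bool)
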
Return type:

None

load()[source]

Fetch all matching records and return them as a list.

This is the simple, non-streaming interface. Under the hood it just drains lazy_load() into a list. For large tables, prefer lazy_load() directly to avoid holding everything in memory.

Return type:

list[SnowDocument]

Returns:

List of SnowDocument instances, one per record.

lazy_load(since=None)[source]

Fetch records and yield them one at a time as SnowDocuments.

This is the primary loading interface. It streams records through SnowConnection’s paginated API and converts each one to a document on the fly. Memory usage stays flat regardless of how many records are in the table.

Parameters:

since (datetime | None) – Optional cutoff datetime for delta sync. When set, only records updated after this point are fetched.

Yields:

SnowDocument instances, one per ServiceNow record.

Return type:

Generator[SnowDocument, None, None]

load_since(since)[source]

Fetch only records updated after the given datetime.

Convenience wrapper around lazy_load() for incremental syncing. Pass the timestamp of your last successful sync and you will only get records that changed since then.

Parameters:

since (datetime) – Cutoff datetime. Records with sys_updated_on after this value are included.

Return type:

list[SnowDocument]

Returns:

List of SnowDocument instances for the updated records.
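
One way to picture the delta filter is as an extra condition joined onto the encoded query. The sketch below assumes a sys_updated_on> condition with a "YYYY-MM-DD HH:MM:SS" timestamp and ignores time zones. The library's exact query syntax is not shown in this reference, and build_query is a hypothetical helper.

```python
from datetime import datetime
from typing import Optional


def build_query(query: Optional[str] = None, since: Optional[datetime] = None) -> str:
    """Hypothetical helper: combine a filter, a delta cutoff, and the ordering suffix."""
    parts = []
    if query:
        parts.append(query)
    if since is not None:
        # Assumed condition syntax and timestamp format.
        parts.append(f"sys_updated_on>{since:%Y-%m-%d %H:%M:%S}")
    parts.append("ORDERBYsys_created_on")  # suffix documented for get_records()
    return "^".join(parts)


last_sync = datetime(2024, 1, 15, 8, 30)
encoded = build_query("active=true", last_sync)
```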

concurrent_lazy_load(since=None, max_workers=16)[source]

Fetch records in parallel and yield them as SnowDocuments.

This is the threaded counterpart to lazy_load(). It dispatches page fetches across a ThreadPoolExecutor inside SnowConnection, so wall clock time on large tables drops roughly in proportion to max_workers (subject to ServiceNow rate limits and instance capacity). Memory stays flat because results are still streamed as they arrive.

Records are yielded in the order pages complete, which is not the same as sys_created_on order. If you need a stable ordering, sort downstream or use lazy_load() instead.

Parameters:
  • since (datetime | None) – Optional cutoff datetime for delta sync. When set, only records updated after this point are fetched.

  • max_workers (int) – Number of worker threads to use for page fetches. Defaults to 16. Higher values speed up large tables but may trip ServiceNow rate limits on smaller instances.

Yields:

SnowDocument instances, one per ServiceNow record, in the order their pages complete (not sys_created_on order).

Return type:

Generator[SnowDocument, None, None]

concurrent_load(max_workers=16)[source]

Fetch all matching records in parallel and return them as a list.

Threaded counterpart to load(). Drains concurrent_lazy_load() into a list, so the same ordering caveat applies: documents come back in page-completion order, not sys_created_on order. For very large tables, prefer concurrent_lazy_load() to keep memory bounded.

Parameters:

max_workers (int) – Number of worker threads to use for page fetches. Defaults to 16.

Return type:

list[SnowDocument]

Returns:

List of SnowDocument instances, one per record, in the order their pages completed.
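
If a stable order is needed after a concurrent load, the results can be sorted downstream. The sketch below uses plain dicts as stand-ins for documents. It assumes the metadata carries a sys_created_on value in "YYYY-MM-DD HH:MM:SS" form, which may not hold if display values are localized.

```python
from datetime import datetime

# Stand-ins for documents returned in page-completion order.
docs = [
    {"page_content": "third", "metadata": {"sys_created_on": "2024-03-01 09:00:00"}},
    {"page_content": "first", "metadata": {"sys_created_on": "2024-01-05 17:45:10"}},
    {"page_content": "second", "metadata": {"sys_created_on": "2024-02-10 08:00:00"}},
]


def created_at(doc: dict) -> datetime:
    """Parse the assumed timestamp format into a sortable datetime."""
    return datetime.strptime(doc["metadata"]["sys_created_on"], "%Y-%m-%d %H:%M:%S")


ordered = sorted(docs, key=created_at)
```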

Loaders

class snowloader.IncidentLoader[source]

Bases: BaseSnowLoader

Loads incident records from ServiceNow.

Produces documents with a structured text layout that includes the incident number, summary, full description, current state, priority, category, assignment info, relevant dates, and optionally the resolution notes and journal entries (work notes + comments).

The text format is designed to give language models enough context to answer questions about incidents without needing to understand ServiceNow’s data model. Each section is clearly labeled so retrieval systems can match on specific parts of the content.

Parameters:
  • connection (SnowConnection) – An initialized SnowConnection instance.

  • query (str | None) – Optional encoded query for filtering incidents.

  • fields (list[str] | None) – Optional field list. If not set, the loader requests all fields needed for document assembly.

  • include_journals (bool) – If True, fetches work notes and comments from sys_journal_field and appends them to each document.

Example

conn = SnowConnection(
    instance_url="https://mycompany.service-now.com",
    username="api_user",
    password="api_pass",
)
loader = IncidentLoader(conn, query="active=true^priority<=2")
for doc in loader.lazy_load():
    print(doc.page_content[:200])

table: str = 'incident'
content_fields: list[str] = ['short_description', 'description']
class snowloader.KnowledgeBaseLoader[source]

Bases: BaseSnowLoader

Loads Knowledge Base articles from ServiceNow.

Produces documents where page_content contains the article title followed by the cleaned body text. HTML from the text field is automatically stripped and converted to plain text. If the text field is empty, the loader falls back to the wiki field.

Metadata includes the article number, knowledge base name, topic, category, author, workflow state, and timestamps.

Parameters:
  • connection (SnowConnection) – An initialized SnowConnection instance.

  • query (str | None) – Optional encoded query for filtering articles.

  • fields (list[str] | None) – Optional field list override.

  • include_journals (bool) – Whether to append journal entries.

Example

conn = SnowConnection(...)
loader = KnowledgeBaseLoader(conn, query="workflow_state=published")
for doc in loader.lazy_load():
    print(doc.page_content[:200])

table: str = 'kb_knowledge'
content_fields: list[str] = ['short_description', 'text']
class snowloader.CMDBLoader[source]

Bases: BaseSnowLoader

Loads CMDB Configuration Items with optional relationship traversal.

By default targets the base cmdb_ci table, but you can point it at any CMDB class (cmdb_ci_server, cmdb_ci_service, etc.) with the ci_class parameter. When include_relationships is True, the loader makes additional queries to cmdb_rel_ci for each CI to map out the dependency and containment graph.

The resulting documents include the CI’s identity (name, class, status), technical details (IP, FQDN, OS when available), assignment info, and a relationship section showing connected CIs with direction arrows.

Parameters:
  • connection (SnowConnection) – An initialized SnowConnection instance.

  • ci_class (str | None) – CMDB class table name. Defaults to "cmdb_ci", which covers all CI types. Use "cmdb_ci_server" etc. for specific classes.

  • query (str | None) – Optional encoded query for filtering CIs.

  • fields (list[str] | None) – Optional field list override.

  • include_relationships (bool) – If True, fetches relationship data from cmdb_rel_ci for each CI. This adds 2 API calls per CI so it is off by default.

Example

conn = SnowConnection(...)
loader = CMDBLoader(
    conn,
    ci_class="cmdb_ci_server",
    query="operational_status=1",
    include_relationships=True,
)
for doc in loader.lazy_load():
    print(doc.page_content[:300])

content_fields: list[str] = ['name', 'short_description']
__init__(connection, ci_class=None, query=None, fields=None, include_relationships=False, max_relationship_workers=2)[source]
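Parameters:
  • connection (SnowConnection)

  • ci_class (str | None)

  • query (str | None)

  • fields (list[str] | None)

  • include_relationships (bool)

  • max_relationship_workers (int)
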
Return type:

None

table: str = 'cmdb_ci'
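
The "2 API calls per CI" noted for include_relationships plausibly correspond to one lookup per direction on cmdb_rel_ci. The sketch below assumes exactly that: one query where the CI is the parent and one where it is the child. The arrow format is invented for illustration, and relationship_queries and format_relationship are hypothetical helpers.

```python
def relationship_queries(ci_sys_id: str) -> dict[str, str]:
    """Hypothetical helper: one encoded query per relationship direction."""
    return {
        "outgoing": f"parent={ci_sys_id}",  # this CI points at another CI
        "incoming": f"child={ci_sys_id}",   # another CI points at this one
    }


def format_relationship(direction: str, rel_type: str, other_name: str) -> str:
    """Render one relationship line with a direction arrow (assumed layout)."""
    arrow = "->" if direction == "outgoing" else "<-"
    return f"{arrow} {rel_type}: {other_name}"


queries = relationship_queries("abc123")
line = format_relationship("outgoing", "Depends on", "db-server-01")
```
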
class snowloader.ChangeLoader[source]

Bases: BaseSnowLoader

Loads change request records from ServiceNow.

Produces documents that capture the full change lifecycle: what is being changed, the risk assessment, the implementation schedule, who is responsible, and which CI is affected. Optionally includes journal entries for CAB notes, implementation updates, and post-change reviews.

Parameters:
  • connection (SnowConnection) – An initialized SnowConnection instance.

  • query (str | None) – Optional encoded query for filtering change requests.

  • fields (list[str] | None) – Optional field list override.

  • include_journals (bool) – If True, fetches work notes and comments.

Example

conn = SnowConnection(...)
loader = ChangeLoader(conn, query="state=3")  # Implement state
for doc in loader.lazy_load():
    print(doc.page_content[:200])

table: str = 'change_request'
content_fields: list[str] = ['short_description', 'description']
class snowloader.ProblemLoader[source]

Bases: BaseSnowLoader

Loads problem records from ServiceNow.

Produces documents that include the problem description, root cause analysis (when available), known error flagging, and fix notes. This gives language models the context they need for pattern recognition across incidents and proactive problem identification.

Parameters:
  • connection (SnowConnection) – An initialized SnowConnection instance.

  • query (str | None) – Optional encoded query for filtering problems.

  • fields (list[str] | None) – Optional field list override.

  • include_journals (bool) – If True, fetches work notes and comments.

Example

conn = SnowConnection(...)
loader = ProblemLoader(conn, query="known_error=true")
for doc in loader.lazy_load():
    print(doc.page_content[:200])

table: str = 'problem'
content_fields: list[str] = ['short_description', 'description']
class snowloader.CatalogLoader[source]

Bases: BaseSnowLoader

Loads service catalog items from ServiceNow.

Produces documents that describe catalog offerings with their name, description, category, price, and catalog association. The text layout is designed for retrieval systems that help end users find the right service to request.

Parameters:
  • connection (SnowConnection) – An initialized SnowConnection instance.

  • query (str | None) – Optional encoded query for filtering catalog items.

  • fields (list[str] | None) – Optional field list override.

Example

conn = SnowConnection(...)
loader = CatalogLoader(conn, query="active=true")
for doc in loader.lazy_load():
    print(doc.page_content[:200])

table: str = 'sc_cat_item'
content_fields: list[str] = ['name', 'short_description', 'description']

Field Utilities

Shared field extraction utilities for ServiceNow API responses.

These helpers normalize the varied shapes that ServiceNow fields can take depending on the sysparm_display_value setting. Every loader in the package uses them to safely pull human-readable text and raw sys_id values from record dicts.

Author: Roni Das

snowloader.loaders._field_utils.display_value(field)[source]

Extract the human-readable display value from a field.

ServiceNow reference fields come back in different shapes depending on the sysparm_display_value setting:

  • "true": {"display_value": "John Smith", "link": "..."}

  • "all": {"display_value": "John Smith", "value": "abc123"}

  • "false": "abc123" (plain string)

  • Empty: None or ""

This function normalizes all of them into a simple string.

Parameters:

field (Any) – Raw field value from the API response.

Return type:

str

Returns:

The display string, or empty string for None/empty values.
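
The normalization described above can be sketched in a few lines. This is an illustrative stand-in named display_text, written from the documented input shapes. It is not the library's source.

```python
from typing import Any


def display_text(field: Any) -> str:
    """Stand-in: reduce any of the documented field shapes to a string."""
    if field is None:
        return ""
    if isinstance(field, dict):
        value = field.get("display_value")
        return "" if value is None else str(value)
    return str(field)  # plain string (or other scalar) passes through


shapes = [
    {"display_value": "John Smith", "link": "https://example.invalid/u/abc123"},
    {"display_value": "John Smith", "value": "abc123"},
    "abc123",
    None,
]
normalized = [display_text(s) for s in shapes]
```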

snowloader.loaders._field_utils.raw_value(field)[source]

Extract the underlying sys_id or raw value from a field.

Counterpart to display_value(). When we need the actual sys_id behind a reference field (for linking, dedup, etc.), this pulls the value key from the dict format. With sysparm_display_value=true, reference fields arrive as {"display_value": "...", "link": "..."} so we fall back to extracting the sys_id from the link URL.

Parameters:

field (Any) – Raw field value from the API response.

Return type:

str

Returns:

The raw value string, or empty string for None/empty values.
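
The link fallback can be sketched as follows. It assumes the link ends in the record's sys_id (for example .../api/now/table/sys_user/abc123). The stand-in raw_text is hypothetical and may differ from how the library parses the URL.

```python
from typing import Any
from urllib.parse import urlparse


def raw_text(field: Any) -> str:
    """Stand-in: prefer the value key, else take the last path segment of link."""
    if field is None:
        return ""
    if isinstance(field, dict):
        value = field.get("value")
        if value:
            return str(value)
        link = field.get("link") or ""
        # Assumed link shape: <instance>/api/now/table/<table>/<sys_id>
        return urlparse(link).path.rstrip("/").rsplit("/", 1)[-1]
    return str(field)


ref = {
    "display_value": "John Smith",
    "link": "https://mycompany.service-now.com/api/now/table/sys_user/abc123",
}
sys_id = raw_text(ref)
```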

snowloader.loaders._field_utils.parse_boolean(field)[source]

Convert a ServiceNow boolean field to a Python bool.

ServiceNow returns boolean fields as strings ("true"/"false"), but depending on the display_value setting they might also come back as actual booleans, integers (0/1), or None.

Parameters:

field (Any) – Raw field value from the API response.

Return type:

bool

Returns:

True if the field represents a truthy value, False otherwise.
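
A tolerant conversion covering the shapes listed above might look like the sketch below. Treating the string "1" as true is an assumption, and to_bool is a hypothetical stand-in, not the library's parse_boolean().

```python
from typing import Any


def to_bool(field: Any) -> bool:
    """Stand-in: accept strings, real booleans, integers, and None."""
    if isinstance(field, bool):
        return field
    if isinstance(field, (int, float)):
        return field != 0
    if isinstance(field, str):
        return field.strip().lower() in {"true", "1"}
    return False  # None and anything unrecognized


samples = ["true", "false", True, 0, 1, None, " TRUE "]
parsed = [to_bool(s) for s in samples]
```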

HTML Cleaner

Lightweight HTML-to-text converter for ServiceNow content.

ServiceNow stores Knowledge Base article bodies as HTML, which is not great for feeding into language models or embedding pipelines. This module provides a simple, dependency-free cleaner that strips HTML tags and converts common constructs into readable plain text.

We intentionally avoid pulling in BeautifulSoup or lxml for this. The HTML coming out of ServiceNow is relatively tame (mostly paragraphs, lists, basic formatting) and a regex-based approach handles it well enough. The re module from stdlib is all we need.

Author: Roni Das

snowloader.utils.html_cleaner.clean_html(raw_html)[source]

Convert an HTML string to clean plain text.

Handles the common patterns found in ServiceNow KB articles:
  • <br> and <br/> tags become newlines

  • </p>, </div>, </li> tags become newlines for paragraph separation

  • All remaining HTML tags are stripped

  • HTML entities (&amp;, &lt;, &gt;, etc.) are decoded

  • Excess whitespace and blank lines are collapsed

Parameters:

raw_html (str) – The HTML string to clean. Can be empty.

Return type:

str

Returns:

Plain text with reasonable formatting preserved.
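
The listed steps map naturally onto a handful of regular expressions. The sketch below is an approximation written from that list, not the module's source: strip_html is a hypothetical name, and it drops blank lines entirely rather than collapsing them.

```python
import html
import re


def strip_html(raw_html: str) -> str:
    """Stand-in: regex-based HTML-to-text conversion following the listed steps."""
    if not raw_html:
        return ""
    # <br> and <br/> become newlines.
    text = re.sub(r"<br\s*/?>", "\n", raw_html, flags=re.IGNORECASE)
    # Closing block tags become newlines to separate paragraphs and items.
    text = re.sub(r"</(p|div|li)\s*>", "\n", text, flags=re.IGNORECASE)
    # Strip every remaining tag, then decode entities such as &amp;.
    text = re.sub(r"<[^>]+>", "", text)
    text = html.unescape(text)
    # Collapse runs of spaces and tabs, trim each line, and drop empty lines.
    text = re.sub(r"[ \t]+", " ", text)
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


sample = (
    "<p>Restart the <b>agent</b> &amp; retry.</p>"
    "<ul><li>Step one</li><li>Step two<br/>done</li></ul>"
)
cleaned = strip_html(sample)
```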