Sync Data

Add facts in development

Whether you're starting fresh or changing existing rules, the quickest way to iterate on the facts stored in Oso Cloud is the Fact Schema in the UI. The Fact Schema lists the types of facts referenced in your policy; these are the types of facts Oso Cloud expects you to send.


To add a new fact, click + Add next to the type of fact you want to add. To remove an existing fact, click ▼ Show matching facts and then click the Delete button next to the fact you want to delete.

Sync facts in production

Initial sync

Once you've decided how to represent your authorization data in Oso Cloud, you'll need to do a one-time sync to bring Oso Cloud in line with your data. Oso Sync updates the facts in Oso Cloud to match those in your application database.

Updating facts via Oso Sync is currently in beta. Reach out if you're interested in learning more.


oso-cloud experimental reconcile reconcile.yaml --perform-updates

Configuration

For Oso Sync to know where to find the facts you need, create a YAML configuration file that maps your data to facts in Oso Cloud. We currently support the following data sources:

PostgreSQL

facts:
  has_relation(Repository:_, String:parent, Organization:_):
    db: app_db
    query: |-
      select repository.public_id, organization.public_id
      from repository
      join organization
      on organization.id = repository.organization_id
dbs:
  app_db:
    connection_string: postgresql://oso:oso@somerds.instance.aws.com:5432/foo

The config file has two top level fields: facts and dbs.

  • dbs contains a list of databases from which Oso Sync should pull the fact data. Each entry is keyed by a unique name and contains a connection_string value, which must conform to a PostgreSQL connection URI. Alternatively, you can provide an environment variable (prefixed with a $) containing the connection string: connection_string: $ENV_VAR_NAME.
  • facts maps fact types to the database query that fetches all facts of that type. Each fact type is defined with positional variable slots (specified by an underscore _), which are filled by the query in order to generate the facts. For instance, the fact type has_relation(Repository:_, String:parent, Organization:_) has two variables: one in the first argument for the Repository and one in the third argument for the Organization.
    • db is the database that contains fact data for this fact type. Its value should match an identifier from the dbs section.
    • query is the query to fetch all facts of that fact type. Match the columns you're fetching data from positionally with the variables in the fact type. In the example above, repository.public_id is set as the Repository value in the first argument of the fact type, and organization.public_id is set as the Organization value in the third argument.
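The positional matching described above can be sketched in a few lines of Python. This is only an illustration of the rule, not Oso Sync's implementation: the helpers parse_fact_type and fill_slots and the sample row are hypothetical.

```python
import re

FACT_TYPE = "has_relation(Repository:_, String:parent, Organization:_)"


def parse_fact_type(fact_type):
    """Split 'name(Type:value, ...)' into the predicate name and (type, value) pairs."""
    match = re.fullmatch(r"(\w+)\((.*)\)", fact_type.strip())
    if match is None:
        raise ValueError(f"not a fact type: {fact_type!r}")
    name, arg_list = match.groups()
    args = [tuple(part.strip().split(":", 1)) for part in arg_list.split(",")]
    return name, args


def fill_slots(fact_type, row):
    """Fill each `_` slot, left to right, with the next column of `row`."""
    name, args = parse_fact_type(fact_type)
    slots = sum(1 for _, value in args if value == "_")
    if slots != len(row):
        raise ValueError(f"expected {slots} columns, got {len(row)}")
    columns = iter(row)
    filled = [(type_, next(columns) if value == "_" else value) for type_, value in args]
    return name, filled


# A hypothetical row from the query: (repository.public_id, organization.public_id).
print(fill_slots(FACT_TYPE, ("repo_1", "org_1")))
# ('has_relation', [('Repository', 'repo_1'), ('String', 'parent'), ('Organization', 'org_1')])
```

The fixed argument (String:parent) is left untouched, so a query for this fact type must return exactly two columns, in the order of the two slots.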
MongoDB

version: 1
source: mongodb
facts:
  has_relation(Repository:_, String:parent, Organization:_):
    db: app_db
    collection: has_relation
    fields:
      - name: repository
      - name: organization
        is_array: true
    query:
      find: {}
      # `find` and `aggregate` are mutually exclusive
      # aggregate: []
dbs:
  app_db:
    connection_string: mongodb://oso:oso@somemongo.instance.aws.com:27017/foo

The config file has four top level fields: version, source, dbs, and facts.

  • version should have a value of 1.
  • source should have a value of mongodb.
  • dbs contains a list of databases from which Oso Sync should pull the fact data. Each entry is keyed by a unique name and contains a connection_string value, which must conform to a MongoDB connection URI. Alternatively, you can provide an environment variable (prefixed with a $) containing the connection string: connection_string: $ENV_VAR_NAME.
  • facts maps fact types to the database query that fetches all facts of that type. Each fact type is defined with positional variable slots (specified by an underscore _), which are filled by the query in order to generate the facts. For instance, the fact type has_relation(Repository:_, String:parent, Organization:_) has two variables: one in the first argument for the Repository and one in the third argument for the Organization.
    • db is the database that has the collection with the data for this fact type.
    • collection is the collection that contains data for this fact type.
    • fields is an array containing the names of the fields to extract from the documents returned by the query. Each array item maps to the positional variable in the fact type, and all variables must be included. An item may have an optional is_array field; if is_array is true, the field on the document must be an array type and is automatically unwound. At most one field may be configured with is_array: true.
    • query is the query to fetch all documents that contain data for the fact type. Use either the find field or the aggregate field, but not both; the value is passed directly to MongoDB's find or aggregate, respectively. The example above illustrates a query using find. For aggregate queries, using the $out stage results in an error.
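To make the is_array behavior concrete, here is a minimal Python sketch of how one document could expand into several facts. It reflects one reading of "automatically unwound" (one fact per array element, as with MongoDB's $unwind) and is not Oso Sync's actual code; facts_from_document and the sample document are hypothetical.

```python
def facts_from_document(doc, fields):
    """Yield one tuple of slot values per fact derived from `doc`.

    `fields` mirrors the config: a list of dicts with a `name` and an
    optional `is_array` flag. At most one field may set `is_array`.
    """
    array_fields = [f["name"] for f in fields if f.get("is_array")]
    if len(array_fields) > 1:
        raise ValueError("at most one field may set is_array: true")
    if not array_fields:
        yield tuple(doc[f["name"]] for f in fields)
        return
    array_name = array_fields[0]
    if not isinstance(doc[array_name], list):
        raise TypeError(f"field {array_name!r} must be an array")
    # Unwind: emit one tuple per element of the array field.
    for element in doc[array_name]:
        yield tuple(element if f["name"] == array_name else doc[f["name"]] for f in fields)


FIELDS = [{"name": "repository"}, {"name": "organization", "is_array": True}]
# Hypothetical document from the has_relation collection.
doc = {"repository": "repo_1", "organization": ["org_1", "org_2"]}
print(list(facts_from_document(doc, FIELDS)))
# [('repo_1', 'org_1'), ('repo_1', 'org_2')]
```

Each resulting tuple fills the fact type's slots in order, so this document would produce two has_relation facts for repo_1.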
Comma-separated Values (CSV)

version: 1
source: csv
facts:
  has_relation(Repository:_, String:parent, Organization:_):
    fields:
      - name: repository
      - name: organization
    path: /path/to/has_relation.csv

The config file has three top level fields: version, source, and facts.

  • version should have a value of 1.
  • source should have a value of csv.
  • facts maps fact types to the CSV file with the data of that type. Each fact type is defined with positional variable slots (specified by an underscore _), which are filled with data from the corresponding values in the CSV file. For instance, the fact type has_relation(Repository:_, String:parent, Organization:_) has two variables: one in the first argument for the Repository and one in the third argument for the Organization.
    • fields is an array containing the names of the values to extract from the CSV file. The first row in the CSV file must be a header row and must include all of the items in the fields array. Each array item maps to the positional variable in the fact type, and all variables must be included.
    • path is the path to the CSV file with the data for the fact type.
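As a rough illustration of the header-row requirement, the following Python sketch reads CSV text and returns slot values in the order given by fields. It assumes columns are looked up by header name, which is an interpretation of the rules above rather than documented behavior; rows_from_csv and the sample data are hypothetical.

```python
import csv
import io


def rows_from_csv(text, fields):
    """Return one tuple per CSV row, with values ordered as in `fields`."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [name for name in fields if name not in header]
    if missing:
        raise ValueError(f"header row is missing fields: {missing}")
    return [tuple(record[name] for name in fields) for record in reader]


# Hypothetical file contents; the extra `note` column is ignored.
SAMPLE = "organization,repository,note\norg_1,repo_1,x\norg_1,repo_2,y\n"
print(rows_from_csv(SAMPLE, ["repository", "organization"]))
# [('repo_1', 'org_1'), ('repo_2', 'org_1')]
```

Under this reading, the order of the fields array, not the column order in the file, decides which value lands in which slot of the fact type.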

Add and remove facts

Oso Sync will soon be usable for adding and removing facts in production. In the meantime, you can use the approach described below.

Whenever you insert, update, or delete authorization-relevant data in your application, you should use Oso Cloud's Bulk API to mirror that update in Oso Cloud.

💡

This "dual writes" approach is similar to updating an Elasticsearch index to provide up-to-date search results. Oso Cloud is a fast and flexible index for your authorization data that's optimized for producing sub-millisecond authorization decisions.

For example, in our GitCloud example app, when a user creates a new repository, we send a pair of facts to Oso Cloud:


def create_repository(org_id):
    org = Organization(org_id)
    repo = Repository(payload["name"], org)
    # Open a transaction to persist the repository to our datastore.
    session.add(repo)
    # Send facts to Oso Cloud.
    oso.bulk(
        # No facts to remove:
        delete=[],
        # Two facts to add:
        tell=[
            # The parent organization of `repo` is `org`.
            ["has_relation", repo, "organization", org],
            # The creating user gets the "admin" role on the new repository.
            ["has_role", current_user, "admin", repo],
        ],
    )
    # Once the bulk update to Oso Cloud succeeds, commit the transaction.
    session.commit()
    return repo.as_json(), 201

When deleting a repository, the process is the same, except that the facts go in the delete argument of the Bulk API call. You can also use wildcards to remove all facts matching a pattern:


oso.bulk(
    # Two fact patterns to remove:
    delete=[
        # Remove all `has_relation` facts for the repository.
        ["has_relation", repo, None, None],
        # Remove all `has_role` facts for the repository.
        ["has_role", None, None, repo],
    ],
    # No facts to add:
    tell=[],
)

Wildcards are represented as None in Python, null in JavaScript, nil in Ruby, and so on.

💡

When creating new resources, send corresponding facts to Oso Cloud before closing the local transaction. This way, we tell the user we’ve created the new resource once they’re able to access it.

When deleting existing resources, remove corresponding facts from Oso Cloud after closing the local transaction. We wait to remove access until we’re sure the resource no longer exists.
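The deletion ordering can be sketched as follows. The two Recording classes are hypothetical stand-ins that only log calls, so the example runs without a database or an Oso Cloud client; delete_repository is likewise an assumed handler name, not code from GitCloud.

```python
class RecordingOso:
    """Hypothetical stand-in for the Oso Cloud client; it only records calls."""

    def __init__(self, log):
        self.log = log

    def bulk(self, delete=(), tell=()):
        self.log.append(("oso.bulk", list(delete), list(tell)))


class RecordingSession:
    """Hypothetical stand-in for a database session."""

    def __init__(self, log):
        self.log = log

    def delete(self, obj):
        self.log.append(("session.delete", obj))

    def commit(self):
        self.log.append(("session.commit",))


def delete_repository(session, oso, repo):
    # Delete the resource locally and close the transaction first ...
    session.delete(repo)
    session.commit()
    # ... and only then remove the matching facts from Oso Cloud.
    oso.bulk(
        delete=[["has_relation", repo, None, None], ["has_role", None, None, repo]],
        tell=[],
    )


log = []
delete_repository(RecordingSession(log), RecordingOso(log), "repo_1")
print([entry[0] for entry in log])
# ['session.delete', 'session.commit', 'oso.bulk']
```

Creation uses the reverse order: the facts are sent before the local commit, so the resource is never visible before the user can access it.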

To add and remove facts in a single transaction (for example, when updating a user's role from member to admin), use the Bulk API:


oso.bulk(
    # Remove all existing roles for the user:
    delete=[["has_role", user, None, repo]],
    # Add the new role:
    tell=[["has_role", user, "admin", repo]],
)

The Bulk API processes fact removals before additions, so after the above call the user has exactly one role on the repository: admin.
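The removals-then-additions rule and the wildcard matching can be modeled locally with a small Python sketch. It is a simplified in-memory model for reasoning about the semantics, not the Bulk API itself; matches and apply_bulk are hypothetical helpers, and facts are plain tuples of strings.

```python
def matches(pattern, fact):
    """A pattern matches a fact when every non-None position is equal."""
    return len(pattern) == len(fact) and all(
        p is None or p == f for p, f in zip(pattern, fact)
    )


def apply_bulk(facts, delete, tell):
    """Return the fact set after removing `delete` patterns, then adding `tell` facts."""
    remaining = {fact for fact in facts if not any(matches(p, fact) for p in delete)}
    return remaining | {tuple(fact) for fact in tell}


facts = {
    ("has_role", "alice", "member", "repo_1"),
    ("has_role", "bob", "member", "repo_1"),
}
updated = apply_bulk(
    facts,
    delete=[("has_role", "alice", None, "repo_1")],
    tell=[("has_role", "alice", "admin", "repo_1")],
)
print(sorted(updated))
# [('has_role', 'alice', 'admin', 'repo_1'), ('has_role', 'bob', 'member', 'repo_1')]
```

If additions were applied first, the wildcard pattern would also match the new admin fact and remove it, which is why the ordering matters.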

Keep facts in sync

To ensure authorization data remains in sync with application data, it's good practice to periodically refresh the facts in Oso Cloud. You can use Oso Sync to identify and report on data drift. Using the configuration file from the Initial sync section, run:


oso-cloud experimental reconcile reconcile.yaml

This writes the diff to stdout. If 1000 or fewer facts have changed, Oso Sync returns the lists of facts to add or remove:


{
  "type": "facts",
  "fact_types": [
    {
      "fact_type": <Fact>,
      "add": [<Fact>, ...],
      "remove": [<Fact>, ...]
    }
  ]
}

If more than 1000 facts have changed, Oso Sync returns the counts instead:


{
  "type": "counts",
  "fact_types": [
    {
      "fact_type": <Fact>,
      "add_count": 501,
      "remove_count": 500
    }
  ]
}

Oso Sync formats facts in their fully-expanded JSON representation. Any variables in the fact type are represented by a null value:


{
  "predicate": "has_relation",
  "args": [
    { "type": "Repository", "id": null },
    { "type": "String", "id": "parent" },
    { "type": "Organization", "id": null }
  ]
}
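A script consuming this output has to handle both shapes. The Python sketch below summarizes either one as (predicate, adds, removes) tuples; it assumes only the field names shown above, and summarize is a hypothetical helper rather than part of the CLI.

```python
import json


def summarize(diff):
    """Return a list of (predicate, adds, removes) for either output shape."""
    summary = []
    for entry in diff["fact_types"]:
        predicate = entry["fact_type"]["predicate"]
        if diff["type"] == "facts":
            # Full lists of facts are present; count them.
            adds, removes = len(entry["add"]), len(entry["remove"])
        elif diff["type"] == "counts":
            # Only the counts are present.
            adds, removes = entry["add_count"], entry["remove_count"]
        else:
            raise ValueError(f"unexpected diff type: {diff['type']!r}")
        summary.append((predicate, adds, removes))
    return summary


# Sample "counts" output, using the fact type representation shown above.
sample = json.loads("""
{
  "type": "counts",
  "fact_types": [
    {
      "fact_type": {
        "predicate": "has_relation",
        "args": [
          {"type": "Repository", "id": null},
          {"type": "String", "id": "parent"},
          {"type": "Organization", "id": null}
        ]
      },
      "add_count": 501,
      "remove_count": 500
    }
  ]
}
""")
print(summarize(sample))
# [('has_relation', 501, 500)]
```

A periodic job could, for instance, alert when any add or remove count is nonzero.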

Docker

We publish a packaged version of the Oso Sync CLI (x86_64) at public.ecr.aws/osohq/reconcile:latest. To use it, build your own image on top of it with a Dockerfile like this:


FROM public.ecr.aws/osohq/reconcile:latest
ARG CONFIG_PATH
RUN test -n "$CONFIG_PATH" || (echo "CONFIG_PATH argument must be set to path of your reconcile.yaml" && false)
WORKDIR /app
COPY $CONFIG_PATH /app/config.yaml
ENTRYPOINT ["/app/reconcile", "experimental", "reconcile", "/app/config.yaml"]

Build it with:

docker build -t reconcile-tool -f reconcile-tool.Dockerfile --build-arg="CONFIG_PATH=./reconcile.yaml" --platform linux/amd64 .

Talk to an Oso engineer

If you'd like to learn more about using Oso Cloud in your app or have any questions about this guide, connect with us on Slack. We're happy to help.
