Sync Data
Add facts in development
Whether starting fresh or making changes to existing rules, the quickest way to iterate on the facts stored in Oso Cloud is via the Fact Schema in the UI. The Fact Schema lists the types of facts referenced in your policy; these are the types of facts Oso Cloud expects you to send.
To add a new fact, click + Add next to the type of fact you want to add. To remove an existing fact, click ▼ Show matching facts and then click the Delete button next to the fact you want to delete.
Sync facts in production
Initial sync
Once you've decided how to represent your authorization data in Oso Cloud, you'll need to do a one-time sync to bring Oso Cloud in line with your data. We provide Oso Sync to update the facts in Oso Cloud to match those in your application database.
Updating facts via Oso Sync is currently in beta. Reach out if you're interested in learning more.
oso-cloud experimental reconcile reconcile.yaml --perform-updates
Configuration
For Oso Sync to know where to find your facts, create a YAML configuration file that maps your data to facts in Oso Cloud. We currently support the following data sources:
PostgreSQL
facts:
  has_relation(Repository:_, String:parent, Organization:_):
    db: app_db
    query: |-
      select repository.public_id, organization.public_id
      from repository
      join organization
      on organization.id = repository.organization_id
dbs:
  app_db:
    connection_string: postgresql://oso:oso@somerds.instance.aws.com:5432/foo
The config file has two top-level fields: facts and dbs.
dbs contains a list of databases from which Oso Sync should pull the fact data. Each entry is keyed by a unique name and contains a connection_string value, which needs to conform to a PostgreSQL connection URI. Alternatively, you can provide an environment variable (prefixed with a $) containing the connection string: connection_string: $ENV_VAR_NAME.
facts maps fact types to the database query that fetches all facts of that type. Each fact type is defined with positional variable slots (specified by an underscore, _), which are filled by the query in order to generate the facts. For instance, the fact type has_relation(Repository:_, String:parent, Organization:_) has two variables: one in the first argument for the Repository and one in the third argument for the Organization.
db is the database that contains fact data for this fact type. Its value should match an identifier from the dbs section.
query is the query to fetch all facts of that fact type. Match the columns you're fetching data from positionally with the variables in the fact type. In the example above, repository.public_id is set as the Repository value in the first argument of the fact type, and organization.public_id is set as the Organization value in the third argument.
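To make the positional mapping concrete, here is a minimal Python sketch that fills the underscore slots of a fact type with the columns of one query row, in order. The fill_slots helper and its string parsing are hypothetical and written only for illustration; they are not part of Oso Sync and may not match how it parses fact types internally.

```python
import re


def fill_slots(fact_type: str, row: tuple) -> tuple:
    """Fill each `_` slot in `fact_type` with the next column of `row`.

    Hypothetical helper for illustration only; not part of Oso Sync.
    """
    match = re.fullmatch(r"(\w+)\((.*)\)", fact_type.strip())
    if match is None:
        raise ValueError(f"unrecognized fact type: {fact_type!r}")
    predicate, raw_args = match.groups()

    # Each argument looks like `Type:value`, where value `_` marks a variable slot.
    parsed = [raw.strip().partition(":") for raw in raw_args.split(",")]
    slot_count = sum(1 for _, _, value in parsed if value == "_")
    if slot_count != len(row):
        raise ValueError(
            f"fact type has {slot_count} variable slots but row has {len(row)} columns"
        )

    # Consume columns left to right, one per variable slot.
    columns = iter(row)
    args = tuple(
        (type_name, next(columns) if value == "_" else value)
        for type_name, _, value in parsed
    )
    return (predicate, *args)


fact = fill_slots(
    "has_relation(Repository:_, String:parent, Organization:_)",
    ("repo_123", "org_456"),  # made-up row: (repository.public_id, organization.public_id)
)
print(fact)
```

The first column lands in the Repository slot and the second in the Organization slot, while the fixed String:parent argument is left untouched.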
MongoDB
version: 1
source: mongodb
facts:
  has_relation(Repository:_, String:parent, Organization:_):
    db: app_db
    collection: has_relation
    fields:
      - name: repository
      - name: organization
        is_array: true
    query:
      find: {}
      # `find` and `aggregate` are mutually exclusive
      # aggregate: []
dbs:
  app_db:
    connection_string: mongodb://oso:oso@somemongo.instance.aws.com:27017/foo
The config file has four top-level fields: version, source, dbs, and facts.
version should have a value of 1.
source should have a value of mongodb.
dbs contains a list of databases from which Oso Sync should pull the fact data. Each entry is keyed by a unique name and contains a connection_string value, which needs to conform to a MongoDB connection URI. Alternatively, you can provide an environment variable (prefixed with a $) containing the connection string: connection_string: $ENV_VAR_NAME.
facts maps fact types to the database query that fetches all facts of that type. Each fact type is defined with positional variable slots (specified by an underscore, _), which are filled by the query in order to generate the facts. For instance, the fact type has_relation(Repository:_, String:parent, Organization:_) has two variables: one in the first argument for the Repository and one in the third argument for the Organization.
db is the database that has the collection with the data for this fact type.
collection is the collection that contains data for this fact type.
fields is an array containing the names of the fields to extract from the documents returned by the query. Each array item maps to the positional variable in the fact type, and all variables must be included. An item may have an optional is_array field; if is_array is true, the field on the document must be an array type and is automatically unwound. At most one field may be configured with is_array: true.
query is the query to fetch all documents that contain data for the fact type. Either the find or the aggregate field may be used for the query, and these are passed directly to the MongoDB find and aggregate operations, respectively. The example above illustrates a query using find. For aggregate queries, use of the $out stage results in an error.
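The effect of is_array can be sketched in plain Python: a document whose array field holds several values yields one set of slot values per array element. The facts_from_document function below is a hypothetical model of the behavior described above, not Oso Sync's implementation, and the sample document is made up.

```python
def facts_from_document(doc: dict, fields: list) -> list:
    """Return one tuple of slot values per fact derived from `doc`.

    Hypothetical model of the `fields` / `is_array` behavior described in
    this section; not Oso Sync's actual implementation.
    """
    array_fields = [f["name"] for f in fields if f.get("is_array")]
    if len(array_fields) > 1:
        raise ValueError("at most one field may set is_array: true")

    if not array_fields:
        # No array field: the document produces exactly one fact.
        return [tuple(doc[f["name"]] for f in fields)]

    array_name = array_fields[0]
    values = doc[array_name]
    if not isinstance(values, list):
        raise TypeError(f"field {array_name!r} must be an array")

    # Unwind: emit one tuple per element of the array field.
    return [
        tuple(element if f["name"] == array_name else doc[f["name"]] for f in fields)
        for element in values
    ]


fields = [{"name": "repository"}, {"name": "organization", "is_array": True}]
doc = {"repository": "repo_1", "organization": ["org_a", "org_b"]}
unwound = facts_from_document(doc, fields)
print(unwound)
```

Here a single document with two organizations produces two sets of slot values, one per organization, in the order given by fields.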
Comma-separated Values (CSV)
version: 1
source: csv
facts:
  has_relation(Repository:_, String:parent, Organization:_):
    fields:
      - name: repository
      - name: organization
    path: /path/to/has_relation.csv
The config file has three top-level fields: version, source, and facts.
version should have a value of 1.
source should have a value of csv.
facts maps fact types to the CSV file with the data of that type. Each fact type is defined with positional variable slots (specified by an underscore, _), which are filled with data from the corresponding values in the CSV file. For instance, the fact type has_relation(Repository:_, String:parent, Organization:_) has two variables: one in the first argument for the Repository and one in the third argument for the Organization.
fields is an array containing the names of the values to extract from the CSV file. The first row in the CSV file must be a header row and must include all of the items in the fields array. Each array item maps to the positional variable in the fact type, and all variables must be included.
path is the path to the CSV file with the data for the fact type.
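As a rough illustration of the header requirement, the following Python sketch reads a CSV with the standard csv module, checks that every configured field appears in the header row, and returns the values in the order of the fields array. The rows_from_csv function and the sample data are hypothetical; in particular, tolerating extra columns in the file is an assumption of this sketch rather than documented Oso Sync behavior.

```python
import csv
import io


def rows_from_csv(handle, fields):
    """Return one tuple per CSV row, ordered to match `fields`.

    Hypothetical sketch; not Oso Sync's actual implementation.
    """
    reader = csv.DictReader(handle)  # the first row is treated as the header
    header = reader.fieldnames or []
    missing = [name for name in fields if name not in header]
    if missing:
        raise ValueError(f"CSV header is missing fields: {missing}")
    # Values are picked by header name, then ordered by position in `fields`.
    return [tuple(record[name] for name in fields) for record in reader]


# Made-up file contents; the column order differs from the `fields` order on purpose.
sample = io.StringIO(
    "organization,repository,notes\n"
    "org_a,repo_1,first\n"
    "org_b,repo_2,second\n"
)
rows = rows_from_csv(sample, ["repository", "organization"])
print(rows)
```

Each returned tuple lines up with the variable slots of the fact type: the repository value first, then the organization value.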
Add and remove facts
Oso Sync will soon be usable for adding and removing facts in production. In the meantime, you can use the approach described below.
Whenever you insert, update, or delete authorization-relevant data in your application, you should use Oso Cloud's Bulk API to mirror that update in Oso Cloud.
This "dual writes" approach is similar to updating an Elasticsearch index to provide up-to-date search results. Oso Cloud is a fast and flexible index for your authorization data that's optimized for producing sub-millisecond authorization decisions.
For example, in our GitCloud example app, when a user creates a new repository, we send a pair of facts to Oso Cloud:
def create_repository(org_id):
    org = Organization(org_id)
    repo = Repository(payload["name"], org)

    # Open a transaction to persist the repository to our datastore.
    session.add(repo)

    # Send facts to Oso Cloud.
    oso.bulk(
        # No facts to remove:
        delete=[],
        # Two facts to add:
        tell=[
            # The parent organization of `repo` is `org`.
            ["has_relation", repo, "organization", org],
            # The creating user gets the "admin" role on the new repository.
            ["has_role", current_user, "admin", repo],
        ],
    )

    # Once the bulk update to Oso Cloud succeeds, commit the transaction.
    session.commit()
    return repo.as_json(), 201
When deleting a repository, the process is similar, but the facts in the Bulk API call go in the delete list. Additionally, you can use wildcards to remove all facts matching a pattern:
oso.bulk(
    # Two fact patterns to remove:
    delete=[
        # Remove all `has_relation` facts for the repository.
        ["has_relation", repo, None, None],
        # Remove all `has_role` facts for the repository.
        ["has_role", None, None, repo],
    ],
    # No facts to add:
    tell=[],
)
Wildcards are represented as None in Python, null in JavaScript, nil in Ruby, and so on.
When creating new resources, send corresponding facts to Oso Cloud before closing the local transaction. This way, we tell the user we’ve created the new resource once they’re able to access it.
When deleting existing resources, remove corresponding facts from Oso Cloud after closing the local transaction. We wait to remove access until we’re sure the resource no longer exists.
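The two orderings can be contrasted in a short Python sketch. RecordingOso and RecordingSession are hypothetical stand-ins that only log calls; they are not the Oso Cloud client or a real database session, and the bulk signature is assumed to mirror the keyword form used in the examples in this guide.

```python
class RecordingOso:
    """Hypothetical stand-in for the Oso Cloud client; it only records calls."""

    def __init__(self, log):
        self.log = log

    def bulk(self, delete=(), tell=()):
        self.log.append(("oso.bulk", len(delete), len(tell)))


class RecordingSession:
    """Hypothetical stand-in for a database session; it only records calls."""

    def __init__(self, log):
        self.log = log

    def add(self, obj):
        self.log.append(("session.add", obj))

    def delete(self, obj):
        self.log.append(("session.delete", obj))

    def commit(self):
        self.log.append(("session.commit",))


def create_repository(oso, session, repo, org, user):
    session.add(repo)
    # Creating: send facts BEFORE closing the local transaction.
    oso.bulk(tell=[
        ["has_relation", repo, "organization", org],
        ["has_role", user, "admin", repo],
    ])
    session.commit()


def delete_repository(oso, session, repo):
    session.delete(repo)
    session.commit()
    # Deleting: remove facts AFTER the local transaction has closed.
    oso.bulk(delete=[
        ["has_relation", repo, None, None],
        ["has_role", None, None, repo],
    ])


log = []
oso, session = RecordingOso(log), RecordingSession(log)
create_repository(oso, session, "repo_1", "org_a", "alice")
delete_repository(oso, session, "repo_1")
names = [entry[0] for entry in log]
print(names)
```

In the recorded call order, the Oso Cloud write sits before the commit on creation and after the commit on deletion, matching the guidance above.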
To add and remove facts in a single transaction (for example, when updating a user's role from member to admin), use the Bulk API:
oso.bulk(
    # Remove all existing roles for the user:
    delete=[["has_role", user, None, repo]],
    # Add the new role:
    tell=[["has_role", user, "admin", repo]],
)
The Bulk API processes fact removals before additions, so after the above call the user has exactly one role on the repository: admin.
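The delete-then-tell ordering and the wildcard matching can be modeled with an in-memory set of facts. This is only a sketch of the semantics described above: apply_bulk and matches are hypothetical helpers that never contact Oso Cloud, and the facts use plain strings as made-up identifiers.

```python
def matches(pattern, fact):
    """Return True if `fact` matches `pattern`, where None acts as a wildcard."""
    return len(pattern) == len(fact) and all(
        p is None or p == f for p, f in zip(pattern, fact)
    )


def apply_bulk(facts, delete=(), tell=()):
    """Model a bulk update on a local set: removals first, then additions.

    Hypothetical local model for illustration; it does not call Oso Cloud.
    """
    remaining = {f for f in facts if not any(matches(p, f) for p in delete)}
    return remaining | {tuple(f) for f in tell}


facts = {
    ("has_role", "alice", "member", "repo_1"),
    ("has_role", "bob", "member", "repo_1"),
}
updated = apply_bulk(
    facts,
    # Remove every role alice has on repo_1, whatever it is:
    delete=[("has_role", "alice", None, "repo_1")],
    # Then add the new role:
    tell=[("has_role", "alice", "admin", "repo_1")],
)
print(sorted(updated))
```

Because removals run first, the newly added admin fact is not caught by the wildcard pattern, and facts for other users are left alone.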
Keep facts in sync
To ensure authorization data remains in sync with application data, it's good practice to periodically refresh the facts in Oso Cloud. You can use Oso Sync to identify and report on data drift. Using the configuration file from the initial sync, run:
oso-cloud experimental reconcile reconcile.yaml
This prints the diff to stdout. If 1000 or fewer facts have changed, Oso Sync returns the lists of facts to add or remove:
{
  "type": "facts",
  "fact_types": [
    {
      "fact_type": <Fact>,
      "add": [<Fact>, ...],
      "remove": [<Fact>, ...]
    }
  ]
}
If more than 1000 facts have changed, Oso Sync returns the counts instead:
{
  "type": "counts",
  "fact_types": [
    {
      "fact_type": <Fact>,
      "add_count": 501,
      "remove_count": 500
    }
  ]
}
Oso Sync formats facts in their fully-expanded JSON representation.
Any variables in the fact type are represented by a null value:
{
  "predicate": "has_relation",
  "args": [
    { "type": "Repository", "id": null },
    { "type": "String", "id": "parent" },
    { "type": "Organization", "id": null }
  ]
}
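If you want to report on drift programmatically, a small Python sketch like the one below could summarize the diff. It assumes that stdout contains exactly one JSON document in one of the two shapes shown above; summarize_diff is a hypothetical helper, and the sample payload is made up to match the documented format.

```python
import json


def summarize_diff(raw: str) -> list:
    """Return (predicate, facts_to_add, facts_to_remove) for each fact type.

    Hypothetical helper; assumes `raw` is a single JSON document shaped like
    the "facts" or "counts" output shown in this section.
    """
    diff = json.loads(raw)
    summary = []
    for entry in diff["fact_types"]:
        predicate = entry["fact_type"]["predicate"]
        if diff["type"] == "facts":
            # Full lists are returned: count them.
            counts = (len(entry["add"]), len(entry["remove"]))
        elif diff["type"] == "counts":
            # Only totals are returned.
            counts = (entry["add_count"], entry["remove_count"])
        else:
            raise ValueError(f"unexpected diff type: {diff['type']!r}")
        summary.append((predicate, *counts))
    return summary


sample = json.dumps({
    "type": "counts",
    "fact_types": [{
        "fact_type": {
            "predicate": "has_relation",
            "args": [
                {"type": "Repository", "id": None},
                {"type": "String", "id": "parent"},
                {"type": "Organization", "id": None},
            ],
        },
        "add_count": 501,
        "remove_count": 500,
    }],
})
drift = summarize_diff(sample)
print(drift)
```

The same function handles both output shapes, so a scheduled job could alert whenever any entry reports a nonzero count.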
Docker
We publish a packaged version of the CLI (x86_64) for Oso Sync at public.ecr.aws/osohq/reconcile:latest.
To use it, build your own image on top of it using a Dockerfile like this:
FROM public.ecr.aws/osohq/reconcile:latest
ARG CONFIG_PATH
RUN test -n "$CONFIG_PATH" || (echo "CONFIG_PATH argument must be set to path of your reconcile.yaml" && false)
WORKDIR /app
COPY $CONFIG_PATH /app/config.yaml
ENTRYPOINT ["/app/reconcile", "experimental", "reconcile", "/app/config.yaml"]
Build it with:
docker build -t reconcile-tool -f reconcile-tool.Dockerfile --build-arg="CONFIG_PATH=./reconcile.yaml" --platform linux/amd64 .
Talk to an Oso engineer
If you'd like to learn more about using Oso Cloud in your app or have any questions about this guide, connect with us on Slack. We're happy to help.