Service

meganno_client.service.Service

Service objects communicate with back-end MEGAnno services and establish connections to a MEGAnno project.

__init__(host=None, project=None, token=None, auth=None, port=5000)

Initialize a Service object for a MEGAnno project.

Parameters:

Name Type Description Default
host str

Host IP address for the back-end service to connect to. If None, connects to a Megagon-hosted service.

None
project str

Project name. The name needs to be unique within the host domain.

None
token str

User's authentication token.

None
auth Authentication

Authentication object. Can be skipped if a valid token is provided.

None
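
As an illustration, here is a minimal sketch of one way to assemble the constructor arguments, assuming the connection details are kept in environment variables. The variable names (MEGANNO_HOST and so on) and the helper service_kwargs_from_env are hypothetical; only the keyword names host, project, token, and port come from the signature above.

```python
import os


def service_kwargs_from_env(env=None):
    """Collect keyword arguments for Service(...) from environment variables.

    The environment variable names are hypothetical, chosen for this sketch.
    """
    env = os.environ if env is None else env
    kwargs = {
        "host": env.get("MEGANNO_HOST"),        # None -> Megagon-hosted service
        "project": env.get("MEGANNO_PROJECT"),
        "token": env.get("MEGANNO_TOKEN"),
    }
    if "MEGANNO_PORT" in env:
        # Only override the default port (5000) when one is given explicitly.
        kwargs["port"] = int(env["MEGANNO_PORT"])
    return kwargs


# With meganno_client installed, the result could be passed on like this:
# from meganno_client.service import Service
# demo = Service(**service_kwargs_from_env())
```

Keeping the token out of notebooks and scripts is the main reason to read it from the environment in this sketch.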

show(config={})

Show the project management dashboard as a floating panel.

get_service_endpoint(key=None)

Get REST endpoint for the connected project. Endpoints are composed from base project url and routes for specific requests.

Parameters:

Name Type Description Default
key str

Name of the specific request. Mapping to routes is stored in a dictionary SERVICE_ENDPOINTS in constants.py.

None
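
The composition described above can be sketched roughly as follows. The route table, the base URL, and the behavior for key=None are all assumptions made for illustration; the real mapping is the SERVICE_ENDPOINTS dictionary in constants.py, whose contents are not shown here.

```python
# Hypothetical route table; the real one is SERVICE_ENDPOINTS in constants.py.
EXAMPLE_ENDPOINTS = {
    "get_schemas": "/schemas",
    "get_statistics": "/statistics",
}


def compose_endpoint(base_url, key=None, routes=EXAMPLE_ENDPOINTS):
    """Join a base project URL with the route registered under `key`."""
    base = base_url.rstrip("/")
    if key is None:
        # Assumption: without a key, only the base project URL is returned.
        return base
    return base + routes[key]


# Hypothetical base project URL.
print(compose_endpoint("http://localhost:5000/demo_project", "get_schemas"))
```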

get_base_payload()

Get the base payload for any REST request which includes the authentication token.

get_schemas()

Get schema object for the connected project.

get_statistics()

Get the statistics object for the project which supports calculations in the management dashboard.

get_users_by_uids(uids: list = [])

Get user names by their unique IDs.

Parameters:

Name Type Description Default
uids list

list of unique user IDs.

[]

get_annotator()

Get the annotator's own name and user ID. The back-end service identifies the annotator by the token or auth object used to initialize the connection.

search(limit=DEFAULT_LIST_LIMIT, skip=0, uuid_list=None, keyword=None, regex=None, record_metadata_condition=None, annotator_list=None, label_condition=None, label_metadata_condition=None, verification_condition=None)

Search the back-end database based on user-provided predicates.

Parameters:

Name Type Description Default
limit

The maximum number of records returned in the subset.

DEFAULT_LIST_LIMIT
skip

Number of leading records to skip. The first skip rows of the raw results (ordered by import order) are excluded from the returned subset.

0
uuid_list

list of record uuids to filter on

None
keyword

Term for exact keyword searches.

None
regex

Term for regular expression searches.

None
record_metadata_condition

Record-level metadata condition.

{
    "name": # name of the record-level metadata to filter on
    "operator": "=="|"<"|">"|"<="|">="|"exists",
    "value": # value to complete the expression
}

None
annotator_list

list of annotator names to filter on

None
label_condition

Label condition of the annotation.

{
    "name": # name of the label to filter on
    "operator": "=="|"<"|">"|"<="|">="|"exists"|"conflicts",
    "value": # value to complete the expression
}

None
label_metadata_condition

Label metadata condition of the annotation. Note that this can apply to a different label than label_condition.

{
    "label_name": # name of the associated label
    "name": # name of the label-level metadata to filter on
    "operator": "=="|"<"|">"|"<="|">="|"exists",
    "value": # value to complete the expression
}

None
verification_condition

Verification condition of the annotation.

{
    "label_name": # name of the associated label
    "search_mode": "ALL"|"UNVERIFIED"|"VERIFIED"
}

None
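
The condition dictionaries can be built and checked before a search is sent. The helpers below are a sketch, not part of meganno_client: the function names and the label name sentiment are hypothetical, the key is spelled operator to match the label_metadata_condition entry, and leaving out value when none is given (for example with "exists" or "conflicts") is an assumption.

```python
RECORD_OPERATORS = {"==", "<", ">", "<=", ">=", "exists"}
LABEL_OPERATORS = RECORD_OPERATORS | {"conflicts"}
SEARCH_MODES = {"ALL", "UNVERIFIED", "VERIFIED"}


def make_condition(name, operator, value=None, allowed=RECORD_OPERATORS):
    """Build a record_metadata_condition or label_condition dictionary."""
    if operator not in allowed:
        raise ValueError(f"unsupported operator: {operator!r}")
    condition = {"name": name, "operator": operator}
    if value is not None:
        # Assumption: "value" is only needed for comparison operators.
        condition["value"] = value
    return condition


def make_verification_condition(label_name, search_mode="ALL"):
    """Build a verification_condition dictionary."""
    if search_mode not in SEARCH_MODES:
        raise ValueError(f"unsupported search mode: {search_mode!r}")
    return {"label_name": label_name, "search_mode": search_mode}


# "sentiment" is a made-up label name.
label_cond = make_condition("sentiment", "conflicts", allowed=LABEL_OPERATORS)
verif_cond = make_verification_condition("sentiment", "UNVERIFIED")

# With a connected Service object (called `demo` here), a call might look like:
# subset = demo.search(keyword="delayed", limit=50,
#                      label_condition=label_cond,
#                      verification_condition=verif_cond)
```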

Returns:

Name Type Description
subset Subset

Subset meeting the search conditions.
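
The limit and skip parameters allow results to be fetched page by page. The generator below sketches that pattern under one assumption: fetch is a callable taking limit and skip and returning a plain list. With the real client, fetch would wrap search and turn the returned Subset into a list, a step that is not covered here.

```python
def iter_pages(fetch, page_size=100):
    """Yield successive pages from `fetch(limit=..., skip=...)`.

    Iteration stops at the first empty or short page.
    """
    skip = 0
    while True:
        page = fetch(limit=page_size, skip=skip)
        if not page:
            return
        yield page
        if len(page) < page_size:
            # A short page means the results are exhausted.
            return
        skip += page_size


# Stand-in for a back-end query: seven fake records, ordered by import order.
records = list(range(7))


def fake_fetch(limit, skip):
    return records[skip:skip + limit]


pages = list(iter_pages(fake_fetch, page_size=3))
```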

deprecate_submit_annotations(subset=None, uuid_list=[])

Submit annotations for records in a subset to the back-end service database. Results are filtered to only include annotations owned by the authenticated annotator.

Parameters:

Name Type Description Default
subset Subset

The subset object containing records and annotations.

None
uuid_list list

Additional filter. Only subset records whose uuids are in this list will be submitted.

[]

submit_annotations(subset=None, uuid_list=[])

Submit annotations for a batch of records in a subset to the back-end service database. Results are filtered to only include annotations owned by the authenticated annotator.

Parameters:

Name Type Description Default
subset Subset

The subset object containing records and annotations.

None
uuid_list list

Additional filter. Only subset records whose uuids are in this list will be submitted.

[]

import_data_url(url='', file_type=None, column_mapping={})

Import data from a public url, currently only supporting csv files. Each row corresponds to a data record. The file needs at least two columns: one with a unique id for each row, and one with the raw data content.

Parameters:

Name Type Description Default
url str

Public url for csv file

''
file_type str

Currently only supporting type 'CSV'

None
column_mapping dict

Dictionary with fields id specifying id column name, and content specifying content column name. For example, with a csv file with two columns index and tweet:

{
    "id": "index",
    "content": "tweet"
}

{}
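
Before importing, it can help to confirm that a CSV file meets the two stated requirements: the mapped columns exist and the id column is unique. The check below is a local sketch using only the standard library; check_csv_for_import is a hypothetical helper, not part of meganno_client, and the sample rows are made up.

```python
import csv
import io


def check_csv_for_import(csv_text, column_mapping):
    """Validate CSV text against a column mapping; return the record count."""
    reader = csv.DictReader(io.StringIO(csv_text))
    id_col = column_mapping["id"]
    content_col = column_mapping["content"]
    header = reader.fieldnames or []
    missing = [col for col in (id_col, content_col) if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    seen = set()
    for row in reader:
        if row[id_col] in seen:
            raise ValueError(f"duplicate id: {row[id_col]!r}")
        seen.add(row[id_col])
    return len(seen)


sample = "index,tweet\n1,flight was great\n2,lost my bag\n"
n_records = check_csv_for_import(sample, {"id": "index", "content": "tweet"})
```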

import_data_df(df, column_mapping={})

Import data from a pandas DataFrame. Each row corresponds to a data record. The dataframe needs at least two columns: one with a unique id for each row, and one with the raw data content.

Parameters:

Name Type Description Default
df DataFrame

Qualifying dataframe

required
column_mapping dict

Dictionary with field id specifying the id column name and field content specifying the content column name. With a dataframe, users can also import metadata at the same time through the field metadata. For example, for a dataframe with columns index, tweet, and location:

{
    "id": "index",
    "content": "tweet",
    "metadata": "location"
}
Metadata named location will be created for all imported data records.

{}
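
A small sketch of preparing a dataframe for import, assuming pandas is installed. The data values are made up, check_df_for_import is a hypothetical local helper, and the final call is left as a comment because it needs a connected Service object (called demo here).

```python
import pandas as pd

# Made-up example data with an id column, a content column, and one metadata column.
df = pd.DataFrame(
    {
        "index": [1, 2, 3],
        "tweet": ["flight was great", "lost my bag", "boarding on time"],
        "location": ["SFO", "JFK", "SEA"],
    }
)
column_mapping = {"id": "index", "content": "tweet", "metadata": "location"}


def check_df_for_import(frame, mapping):
    """Raise ValueError if a mapped column is missing or ids are not unique."""
    missing = [col for col in mapping.values() if col not in frame.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    if not frame[mapping["id"]].is_unique:
        raise ValueError("id column contains duplicate values")


check_df_for_import(df, column_mapping)
# demo.import_data_df(df, column_mapping=column_mapping)
```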

export()

Export all records in the project, together with their annotations, as a dataframe.

Returns:

Name Type Description
export_df DataFrame

A pandas dataframe with columns 'data_id', 'content', 'annotator', 'label_name', 'label_value' for all records in the project
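
Because the export is a regular pandas dataframe, standard pandas operations apply to it. The sketch below counts how often each label value occurs. The rows are made up to stand in for a real export; only the column names come from the description above, and real label_value entries may have a different shape than the plain strings assumed here.

```python
import pandas as pd

# Made-up rows standing in for the result of export().
export_df = pd.DataFrame(
    {
        "data_id": [1, 1, 2, 3],
        "content": ["great flight", "great flight", "lost my bag", "on time"],
        "annotator": ["ann", "bob", "ann", "bob"],
        "label_name": ["sentiment"] * 4,
        "label_value": ["pos", "pos", "neg", "pos"],
    }
)

# Number of annotations per (label_name, label_value) pair.
label_counts = (
    export_df.groupby(["label_name", "label_value"])
    .size()
    .reset_index(name="count")
)
```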

set_metadata(meta_name, func, batch_size=500)

Set metadata for all records in the back-end database, based on user-defined function for metadata calculation.

Parameters:

Name Type Description Default
meta_name str

Name of the metadata. Will be used to identify and query the metadata.

required
func function(raw_content)

Function that takes the raw data content as input and returns the corresponding metadata (int, string, vector, ...).

required
batch_size int

Batch size for back-end database updates.

500
Example
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')
# `demo` is an existing Service object; store a sentence embedding as metadata for every record
demo.set_metadata("bert-embedding",
                  lambda x: list(model.encode(x).astype(float)), 500)
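
For a lighter-weight variant that needs no model download, any plain function of the raw content works as func. The sketch below uses a word count; the metadata name word_count is an arbitrary choice, and the set_metadata call is left as a comment because it needs a connected Service object (demo, as in the example above).

```python
def word_count(raw_content: str) -> int:
    """Return the number of whitespace-separated tokens in a record."""
    return len(raw_content.split())


# demo.set_metadata("word_count", word_count, batch_size=500)
```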

get_assignment(annotator=None, latest_only=False)

Get workload assignment for annotator.

Parameters:

Name Type Description Default
annotator str

User ID to query. If set to None, use ID of auth token holder.

None
latest_only bool

If True, return only the latest assignment for the user. Otherwise, return the set of all assigned records.

False