Skip to main content
Version: v0.13

Domain Layer Schema

Summary

This document describes Apache DevLake's domain layer schema.

Referring to DevLake's architecture, the data in the domain layer is transformed from the data in the tool layer. The tool layer schema is based on the data from specific tools such as Jira, GitHub, Gitlab, Jenkins, etc. The domain layer schema can be regarded as an abstraction of tool-layer schemas.

Domain layer schema itself includes 2 logical layers: a DWD layer and a DWM layer. The DWD layer stores the detailed data points, while the DWM is the slight aggregation and operation of DWD to store more organized details or middle-level metrics.

Use Cases

  1. All metrics from pre-built dashboards are based on this data schema.
  2. As a user, you can create your own customized dashboards based on this data schema.
  3. As a contributor, you can refer to this data schema while working on the ETL logic when adding/updating data source plugins.

Data Models

This is the up-to-date domain layer schema for DevLake v0.10.x. Tables (entities) are categorized into 5 domains.

  1. Issue tracking domain entities: Jira issues, GitHub issues, GitLab issues, etc.
  2. Source code management domain entities: Git/GitHub/Gitlab commits and refs(tags and branches), etc.
  3. Code review domain entities: GitHub PRs, Gitlab MRs, etc.
  4. CI/CD domain entities: Jenkins jobs & builds, etc.
  5. Cross-domain entities: entities that map entities from different domains to break data isolation.

Schema Diagram

Domain Layer Schema

When reading the schema, you'll notice that many tables' primary key is called id. Unlike auto-increment id or UUID, id is a string composed of several parts to uniquely identify similar entities (e.g. repo) from different platforms (e.g. Github/Gitlab) and allow them to co-exist in a single table.

Tables that end with WIP are still under development.

Naming Conventions

  1. The name of a table is in plural form. Eg. boards, issues, etc.
  2. The name of a table which describe the relation between 2 entities is in the form of [BigEntity in singular form]_[SmallEntity in plural form]. Eg. board_issues, sprint_issues, pull_request_comments, etc.
  3. Value of the field in enum type are in capital letters. Eg. table.issues.type has 3 values, REQUIREMENT, BUG, INCIDENT. Values that are phrases, such as 'IN_PROGRESS' of table.issues.status, are separated with underscore '_'.

Get all domain layer model info.

All domain layer models can be accessed by the following method

import "github.com/apache/incubator-devlake/models/domainlayer/domaininfo"

domaininfo := domaininfo.GetDomainTablesInfo()
for _, table := range domaininfo {
// do something
}

If you want to learn more about plugin models,please visit PluginImplementation

DWD Entities - (Data Warehouse Detail)

Domain 1 - Issue Tracking

issues

An issue is the abstraction of Jira/Github/GitLab/TAPD/... issues.

fieldtypelengthdescriptionkey
idvarchar255An issue's id is composed of < plugin >:< Entity >:< PK0 >[:PK1]..."
  • For Github issues, a Github issue's id is like "github:GithubIssues:< GithubIssueId >". Eg. 'github:GithubIssues:1049355647'
  • For Jira issues, a Github repo's id is like "jira:JiraIssues:< JiraSourceId >:< JiraIssueId >". Eg. 'jira:JiraIssues:1:10063'. < JiraSourceId > is used to identify which jira source the issue came from, since DevLake users can import data from several different Jira instances at the same time.
PK
issue_keyvarchar255The key of this issue. For example, the key of this Github issue is 1145.
urlvarchar255The url of the issue. It's a web address in most cases.
titlevarchar255The title of an issue
descriptionlongtextThe detailed description/summary of an issue
typevarchar255The standard type of this issue. There're 3 standard types:
  • REQUIREMENT: this issue is a feature
  • BUG: this issue is a bug found during test
  • INCIDENT: this issue is a bug found after release
The 3 standard types are transformed from the original types of an issue. The transformation rule is set in the '.env' file or 'config-ui' before data collection. For issues with an original type that has not mapped to a standard type, the value of type will be the issue's original type.
statusvarchar255The standard statuses of this issue. There're 3 standard statuses:
  • TODO: this issue is in backlog or to-do list
  • IN_PROGRESS: this issue is in progress
  • DONE: this issue is resolved or closed
The 3 standard statuses are transformed from the original statuses of an issue. The transformation rule:
  • For Jira issue status: transformed from the Jira issue's statusCategory. Jira issue has 3 default status categories: 'To Do', 'In Progress', 'Done'.
  • For Github issue status:
    • open -> TODO
    • closed -> DONE
original_statusvarchar255The original status of an issue.
story_pointintThe story point of this issue. It's default to an empty string for data sources such as Github issues and Gitlab issues.
priorityvarchar255The priority of the issue
componentvarchar255The component a bug-issue affects. This field only supports Github plugin for now. The value is transformed from Github issue labels by the rules set according to the user's configuration of .env by end users during DevLake installation.
severityvarchar255The severity level of a bug-issue. This field only supports Github plugin for now. The value is transformed from Github issue labels by the rules set according to the user's configuration of .env by end users during DevLake installation.
parent_issue_idvarchar255The id of its parent issue
epic_keyvarchar255The key of the epic this issue belongs to. For tools with no epic-type issues such as Github and Gitlab, this field is default to an empty string
original_estimate_minutesintThe orginal estimation of the time allocated for this issue
time_spent_minutesintThe orginal estimation of the time allocated for this issue
time_remaining_minutesintThe remaining time to resolve the issue
creator_idvarchar255The id of issue creator
creator_namevarchar255The name of the creator
assignee_idvarchar255The id of issue assignee.
  • For Github issues: this is the last assignee of an issue if the issue has multiple assignees
  • For Jira issues: this is the assignee of the issue at the time of collection
assignee_namevarchar255The name of the assignee
created_datedatetime3The time issue created
updated_datedatetime3The last time issue gets updated
resolution_datedatetime3The time the issue changes to 'DONE'.
lead_time_minutesintDescribes the cycle time from issue creation to issue resolution.
  • For issues whose type = 'REQUIREMENT' and status = 'DONE', lead_time_minutes = resolution_date - created_date. The unit is minute.
  • For issues whose type != 'REQUIREMENT' or status != 'DONE', lead_time_minutes is null

issue_labels

This table shows the labels of issues. Multiple entries can exist per issue. This table can be used to filter issues by label name.

fieldtypelengthdescriptionkey
namevarchar255Label name
issue_idvarchar255Issue IDFK_issues.id

issue_comments(WIP)

This table shows the comments of issues. Issues with multiple comments are shown as multiple records. This table can be used to calculate metric - issue response time.

fieldtypelengthdescriptionkey
idvarchar255The unique id of a commentPK
issue_idvarchar255Issue IDFK_issues.id
account_idvarchar255The id of the account who made the commentFK_accounts.id
bodylongtextThe body/detail of the comment
created_datedatetime3The creation date of the comment
updated_datedatetime3The last time comment gets updated

issue_changelogs

This table shows the changelogs of issues. Issues with multiple changelogs are shown as multiple records. This is transformed from Jira or TAPD changelogs.

fieldtypelengthdescriptionkey
idvarchar255The unique id of an issue changelogPK
issue_idvarchar255Issue IDFK_issues.id
author_idvarchar255The id of the user who made the changeFK_accounts.id
author_namevarchar255The id of the user who made the changeFK_accounts.id
field_idvarchar255The id of changed field
field_namevarchar255The id of changed field
original_from_valuevarchar255The original value of the changed field
original_to_valuevarchar255The new value of the changed field
from_valuevarchar255The transformed/standardized original value of the changed field
to_valuevarchar255The transformed/standardized new value of the changed field
created_datedatetime3The creation date of the changelog

issue_worklogs

This table shows the work logged under issues. Usually, an issue has multiple worklogs logged by different developers.

fieldtypelengthdescriptionkey
idvarchar255The id of the worklogPK
author_idvarchar255The id of the author who logged the workFK_acccounts.id
commentlongtext255The comment made while logging the work.
time_spent_minutesintThe time logged. The unit of value is normalized to minute. Eg. 1d =) 480, 4h30m =) 270
logged_datedatetime3The time of this logging action
started_datedatetime3Start time of the worklog
issue_idvarchar255Issue IDFK_issues.id

boards

A board is an issue list or a collection of issues. It's the abstraction of a Jira board, a Jira project, a Github issue list or a GitLab issue list. This table can be used to filter issues by the boards they belong to.

fieldtypelengthdescriptionkey
idvarchar255A board's id is composed of "< plugin >:< Entity >:< PK0 >[:PK1]..."
  • For a Github repo's issue list, the board id is like "< github >:< GithubRepos >:< GithubRepoId >". Eg. "github:GithubRepo:384111310"
  • For a Jira Board, the id is like the board id is like "< jira >:< JiraSourceId >< JiraBoards >:< JiraBoardsId >". Eg. "jira:1:JiraBoards:12"
PK
namevarchar255The name of the board. Note: the board name of a Github project 'merico-dev/lake' is 'merico-dev/lake', representing the default issue list.
descriptionvarchar255The description of the board.
urlvarchar255The url of the board. Eg. https://Github.com/merico-dev/lake
created_datedatetime3Board creation time

board_issues

This table shows the relation between boards and issues. This table can be used to filter issues by board.

fieldtypelengthdescriptionkey
board_idvarchar255Board idFK_boards.id
issue_idvarchar255Issue idFK_issues.id

sprints

A sprint is the abstraction of Jira sprints, TAPD iterations and Github milestones. A sprint contains a list of issues.

fieldtypelengthdescriptionkey
idvarchar255A sprint's id is composed of "< plugin >:< Entity >:< PK0 >[:PK1]..."
  • A sprint in a Github repo is a milestone, the sprint id is like "< github >:< GithubRepos >:< GithubRepoId >:< milestoneNumber >".
    Eg. The id for this sprint is "github:GithubRepo:384111310:5"
  • For a Jira Board, the id is like "< jira >:< JiraSourceId >< JiraBoards >:< JiraBoardsId >".
    Eg. "jira:1:JiraBoards:12"
PK
namevarchar255The name of sprint.
For Github projects, the sprint name is the milestone name. For instance, 'v0.10.0 - Introduce Temporal to DevLake' is the name of this sprint.
urlvarchar255The url of sprint.
statusvarchar255There're 3 statuses of a sprint:
  • CLOSED: a completed sprint
  • ACTIVE: a sprint started but not completed
  • FUTURE: a sprint that has not started
started_datedatetime3The start time of a sprint
ended_datedatetime3The planned/estimated end time of a sprint. It's usually set when planning a sprint.
completed_datedatetime3The actual time to complete a sprint.
original_board_iddatetime3The id of board where the sprint first created. This field is not null only when this entity is transformed from Jira sprintas.
In Jira, sprint and board entities have 2 types of relation:
  • A sprint is created based on a specific board. In this case, board(1):(n)sprint. The original_board_id is used to show the relation.
  • A sprint can be mapped to multiple boards, a board can also show multiple sprints. In this case, board(n):(n)sprint. This relation is shown in table.board_sprints
FK_boards.id

sprint_issues

This table shows the relation between sprints and issues that have been added to sprints. This table can be used to show metrics such as 'ratio of unplanned issues', 'completion rate of sprint issues', etc

fieldtypelengthdescriptionkey
sprint_idvarchar255Sprint idFK_sprints.id
issue_idvarchar255Issue idFK_issues.id
is_removedboolIf the issue is removed from this sprint, then TRUE; else FALSE
added_datedatetime3The time this issue added to the sprint. If an issue is added to a sprint multiple times, the latest time will be the value.
removed_datedatetime3The time this issue gets removed from the sprint. If an issue is removed multiple times, the latest time will be the value.
added_stagevarchar255The stage when issue is added to this sprint. There're 3 possible values:
  • BEFORE_SPRINT
    Planning before sprint starts.
    Condition: sprint_issues.added_date <= sprints.start_date
  • DURING_SPRINT Planning during a sprint.
    Condition: sprints.start_date < sprint_issues.added_date <= sprints.end_date
  • AFTER_SPRINT
    Planing after a sprint. This is caused by improper operation - adding issues to a completed sprint.
    Condition: sprint_issues.added_date ) sprints.end_date
resolved_stagevarchar255The stage when an issue is resolved (issue status turns to 'DONE'). There're 3 possible values:
  • BEFORE_SPRINT
    Condition: issues.resolution_date <= sprints.start_date
  • DURING_SPRINT
    Condition: sprints.start_date < issues.resolution_date <= sprints.end_date
  • AFTER_SPRINT
    Condition: issues.resolution_date ) sprints.end_date

board_sprints

fieldtypelengthdescriptionkey
board_idvarchar255Board idFK_boards.id
sprint_idvarchar255Sprint idFK_sprints.id

Domain 2 - Source Code Management

repos

Information about Github or Gitlab repositories. A repository is always owned by a user.

fieldtypelengthdescriptionkey
idvarchar255A repo's id is composed of "< plugin >:< Entity >:< PK0 >[:PK1]..."
For example, a Github repo's id is like "< github >:< GithubRepos >< GithubRepoId >". Eg. 'github:GithubRepos:384111310'
PK
namevarchar255The name of repo.
descriptionvarchar255The description of repo.
urlvarchar255The url of repo. Eg. https://Github.com/merico-dev/lake
owner_idvarchar255The id of the owner of repoFK_accounts.id
languagevarchar255The major language of repo. Eg. The language for merico-dev/lake is 'Go'
forked_fromvarchar255Empty unless the repo is a fork in which case it contains the id of the repo the repo is forked from.
deletedtinyint2550: repo is active 1: repo has been deleted
created_datedatetime3Repo creation date
updated_datedatetime3Last full update was done for this repo

repo_languages(WIP)

Languages that are used in the repository along with byte counts for all files in those languages. This is in line with how Github calculates language percentages in a repository. Multiple entries can exist per repo.

The table is filled in when the repo has been first inserted on when an update round for all repos is made.

fieldtypelengthdescriptionkey
idvarchar255A repo's id is composed of "< plugin >:< Entity >:< PK0 >[:PK1]..."
For example, a Github repo's id is like "< github >:< GithubRepos >< GithubRepoId >". Eg. 'github:GithubRepos:384111310'
PK
languagevarchar255The language of repo.
These are the languages for merico-dev/lake
bytesintThe byte counts for all files in those languages
created_datedatetime3The field is filled in with the latest timestamp the query for a specific repo_id was done.

repo_commits

The commits belong to the history of a repository. More than one repos can share the same commits if one is a fork of the other.

fieldtypelengthdescriptionkey
repo_idvarchar255Repo idFK_repos.id
commit_shachar40Commit shaFK_commits.sha

refs

A ref is the abstraction of a branch or tag.

fieldtypelengthdescriptionkey
idvarchar255A ref's id is composed of "< plugin >:< Entity >:< PK0 >[:PK1]..."
For example, a Github ref is composed of "github:GithubRepos:< GithubRepoId >:< RefUrl >". Eg. The id of release v5.3.0 of PingCAP/TiDB project is 'github:GithubRepos:384111310:refs/tags/v5.3.0' A repo's id is composed of "< plugin >:< Entity >:< PK0 >[:PK1]..."
PK
ref_namevarchar255The name of ref. Eg. 'refs/tags/v0.9.3'
repo_idvarchar255The id of repo this ref belongs toFK_repos.id
commit_shachar40The commit this ref points to at the time of collection
is_defaultint
  • 0: the ref is the default branch. By the definition of Github, the default branch is the base branch for pull requests and code commits.
  • 1: not the default branch
merge_basechar40The merge base commit of the main ref and the current ref
ref_typevarchar64There're 2 typical types:
  • BRANCH
  • TAG

refs_commits_diffs

This table shows the commits added in a new ref compared to an old ref. This table can be used to support tag-based analysis, for instance, 'No. of commits of a tag', 'No. of merged pull request of a tag', etc.

The records of this table are computed by RefDiff plugin. The computation should be manually triggered after using GitRepoExtractor to collect commits and refs. The algorithm behind is similar to this.

fieldtypelengthdescriptionkey
commit_shachar40One of the added commits in the new ref compared to the old refFK_commits.sha
new_ref_idvarchar255The new ref's id for comparisonFK_refs.id
old_ref_idvarchar255The old ref's id for comparisonFK_refs.id
new_ref_commit_shachar40The commit new ref points to at the time of collection
old_ref_commit_shachar40The commit old ref points to at the time of collection
sorting_indexvarchar255An index for debugging, please skip it

commits

fieldtypelengthdescriptionkey
shachar40One of the added commits in the new ref compared to the old refFK_commits.sha
messagevarchar255Commit message
author_namevarchar255The value is set with command git config user.name xxxxx commit
author_emailvarchar255The value is set with command git config user.email xxxxx author
authored_datedatetime3The date when this commit was originally made
author_idvarchar255The id of commit authorFK_accounts.id
committer_namevarchar255The name of committer
committer_emailvarchar255The email of committer
committed_datedatetime3The last time the commit gets modified.
For example, when rebasing the branch where the commit is in on another branch, the committed_date changes.
committer_idvarchar255The id of committerFK_accounts.id
additionsintAdded lines of code
deletionsintDeleted lines of code
dev_eqintA metric that quantifies the amount of code contribution. The data can be retrieved from AE plugin.

commit_files

The files have been changed via commits.

fieldtypelengthdescriptionkey
idvarchar255The id is composed of "< Commit_sha >:< file_path >"FK_commits.sha
commit_shachar40Commit shaFK_commits.sha
file_pathvarchar255Path of a changed file in a commit
additionsintThe added lines of code in this file by the commit
deletionsintThe deleted lines of code in this file by the commit

components

The components of files extracted from the file paths. This can be used to analyze Git metrics by component.

fieldtypelengthdescriptionkey
repo_idvarchar255The repo idFK_repos.id
namevarchar255The name of component
path_regexvarchar255The regex to extract components from this repo's paths

commit_file_components

The relationship between commit_file and component_name.

fieldtypelengthdescriptionkey
commit_file_idvarchar255The id of commit fileFK_commit_files.id
component_namevarchar255The component name of a file

commit_parents

The parent commit(s) for each commit, as specified by Git.

fieldtypelengthdescriptionkey
commit_shachar40commit shaFK_commits.sha
parentchar40Parent commit shaFK_commits.sha

Domain 3 - Code Review

pull_requests

A pull request is the abstraction of Github pull request and Gitlab merge request.

fieldtypelengthdescriptionkey
idchar40A pull request's id is composed of "< plugin >:< Entity >:< PK0 >[:PK1]..." Eg. For 'github:GithubPullRequests:1347'FK_commits.sha
titlevarchar255The title of pull request
descriptionlongtextThe body/description of pull request
statusvarchar255the status of pull requests. For a Github pull request, the status can either be 'open' or 'closed'.
parent_pr_idvarchar255The id of the parent PR
pull_request_keyvarchar255The key of PR. Eg, 1536 is the key of this PR
base_repo_idvarchar255The repo that will be updated.
head_reop_idvarchar255The repo containing the changes that will be added to the base. If the head repository is NULL, this means that the corresponding project had been deleted when DevLake processed the pull request.
base_refvarchar255The branch name in the base repo that will be updated
head_refvarchar255The branch name in the head repo that contains the changes that will be added to the base
author_namevarchar255The author's name of the pull request
author_idvarchar255The author's id of the pull request
urlvarchar255the web link of the pull request
typevarchar255The work-type of a pull request. For example: feature-development, bug-fix, docs, etc.
The value is transformed from Github pull request labels by configuring GITHUB_PR_TYPE in .env file during installation.
componentvarchar255The component this PR affects.
The value is transformed from Github/Gitlab pull request labels by configuring GITHUB_PR_COMPONENT in .env file during installation.
created_datedatetime3The time PR created.
merged_datedatetime3The time PR gets merged. Null when the PR is not merged.
closed_datedatetime3The time PR closed. Null when the PR is not closed.
merge_commit_shachar40the merge commit of this PR. By the definition of Github, when you click the default Merge pull request option on a pull request on Github, all commits from the feature branch are added to the base branch in a merge commit.
base_commit_shachar40The base commit of this PR.
head_commit_shachar40The head commit of this PR.

pull_request_labels

This table shows the labels of pull request. Multiple entries can exist per pull request. This table can be used to filter pull requests by label name.

fieldtypelengthdescriptionkey
namevarchar255Label name
pull_request_idvarchar255Pull request IDFK_pull_requests.id

pull_request_commits

A commit associated with a pull request

The list is additive. This means if a rebase with commit squashing takes place after the commits of a pull request have been processed, the old commits will not be deleted.

fieldtypelengthdescriptionkey
pull_request_idvarchar255Pull request idFK_pull_requests.id
commit_shachar40Commit shaFK_commits.sha

pull_request_comments

Normal comments, review bodies, reviews' inline comments of GitHub's pull requests or GitLab's merge requests.

fieldtypelengthdescriptionkey
idvarchar255Comment idPK
pull_request_idvarchar255Pull request idFK_pull_requests.id
bodylongtextThe body of the comments
account_idvarchar255The account who made the commentFK_accounts.id
created_datedatetime3Comment creation time
positionintDeprecated
typevarchar255- For normal comments: NORMAL
- For review comments, ie. diff/inline comments: DIFF
- For reviews' body (exist in GitHub but not GitLab): REVIEW
review_idvarchar255Review_id of the comment if the type is REVIEW or DIFF
statusvarchar255Status of the comment

pull_request_events(WIP)

Events of pull requests.

fieldtypelengthdescriptionkey
idvarchar255Event idPK
pull_request_idvarchar255Pull request idFK_pull_requests.id
actionvarchar255The action to be taken, some values:
  • opened: When the pull request has been opened
  • closed: When the pull request has been closed
  • merged: When Github detected that the pull request has been merged. No merges outside Github (i.e. Git based) are reported
  • reoponed: When a pull request is opened after being closed
  • syncrhonize: When new commits are added/removed to the head repository
actor_idvarchar255The account id of the event performerFK_accounts.id
created_datedatetime3Event creation time

Domain 4 - CI/CD(WIP)

jobs

The CI/CD schedule, not a specific task.

fieldtypelengthdescriptionkey
idvarchar255Job idPK
namevarchar255Name of job

builds

A build is an execution of a job.

fieldtypelengthdescriptionkey
idvarchar255Build idPK
job_idvarchar255Id of the job this build belongs toFK_jobs.id
namevarchar255Name of build
duration_secbigintThe duration of build in seconds
started_datedatetime3Started time of the build
statusvarchar255The result of build. The values may be 'success', 'failed', etc.
commit_shachar40The specific commit being built on. Nullable.

cicd_pipelines

A cicd_pipeline is a series of builds that have connections or a standalone build.

fieldtypelengthdescriptionkey
idvarchar255This key is generated based on details from the original pluginPK
created_atdatetime3
updated_atdatetime3
namevarchar255
commit_shavarchar255
branchvarchar255
repovarchar255
resultvarchar100
statusvarchar100
typevarchar100to indicate this is CI or CD
duration_secbigint unsigned
created_datedatetime3
finished_datedatetime3

cicd_pipeline_repos

A map between cic_pipeline and repo info.

fieldtypelengthdescriptionkey
commit_shavarchar255PK
branchvarchar255
repo_urlvarchar255
idvarchar255This key is generated based on details from the original pluginPK
created_atdatetime3
updated_atdatetime3

cicd_tasks

A cicd_task is a single job of ci/cd.

fieldtypelengthdescriptionkey
idvarchar255This key is generated based on details from the original pluginPK
created_atdatetime3
updated_atdatetime3
namevarchar255
pipeline_idvarchar255
resultvarchar100
statusvarchar100
typevarchar100to indicate this is CI or CD
duration_secbigint unsigned
started_datedatetime3
finished_datedatetime3

Cross-Domain Entities

These entities are used to map entities between different domains. They are the key players to break data isolation.

There're low-level entities such as issue_commits, users, and higher-level cross domain entities such as board_repos

issue_commits

A low-level mapping between "issue tracking" and "source code management" domain by mapping issues and commits. Issue(n): Commit(n).

The original connection between these two entities lies in either issue tracking tools like Jira or source code management tools like GitLab. You have to use tools to accomplish this.

For example, a common method to connect Jira issue and GitLab commit is a GitLab plugin Jira Integration. With this plugin, the Jira issue key in the commit message written by the committers will be parsed. Then, the plugin will add the commit urls under this jira issue. Hence, DevLake's Jira plugin can get the related commits (including repo, commit_id, url) of an issue.

fieldtypelengthdescriptionkey
issue_idvarchar255Issue idFK_issues.id
commit_shachar40Commit shaFK_commits.sha

pull_request_issues

This table shows the issues closed by pull requests. It's a medium-level mapping between "issue tracking" and "source code management" domain by mapping issues and commits. Issue(n): Commit(n).

The data is extracted from the body of pull requests conforming to certain regular expression. The regular expression can be defined in GITHUB_PR_BODY_CLOSE_PATTERN in the .env file

fieldtypelengthdescriptionkey
pull_request_idchar40Pull request idFK_pull_requests.id
issue_idvarchar255Issue idFK_issues.id
pull_request_numbervarchar255Pull request key
issue_numbervarchar255Issue key

board_repos (Deprecated)

A way to link "issue tracking" and "source code management" domain by mapping boards and repos. Board(n): Repo(n).

fieldtypelengthdescriptionkey
board_idvarchar255Board idFK_boards.id
repo_idvarchar255Repo idFK_repos.id

accounts

This table stores of user accounts across different tools such as GitHub, Jira, GitLab, etc. This table can be joined to get the metadata of all accounts. metrics, such as 'No. of Issue closed by contributor', 'No. of commits by contributor',

fieldtypelengthdescriptionkey
idvarchar255An account's id is the identifier of the account of a specific tool. It is composed of "< Plugin >:< Entity >:< PK0 >[:PK1]..."
For example, a Github account's id is composed of "< github >:< GithubAccounts >:< GithubUserId >)". Eg. 'github:GithubUsers:14050754'
PK
emailvarchar255Email of the account
full_namevarchar255Full name
user_namevarchar255Username, nickname or Github login of an account
avatar_urlvarchar255
organizationvarchar255User's organization(s)
created_datedatetime3User creation time
statusint0: default, the user is active. 1: the user is not active

users

fieldtypelengthdescriptionkey
idvarchar255id of a personPK
emailvarchar255the primary email of a person
namevarchar255name of a person

user_accounts

fieldtypelengthdescriptionkey
user_idvarchar255users.idComposite PK, FK
account_idvarchar255accounts.idComposite PK, FK

teams

fieldtypelengthdescriptionkey
idvarchar255id from the data sources, decided by DevLake usersPK
namevarchar255name of the team. Eg. team A, team B, etc.
aliasvarchar255alias or abbreviation of a team
parent_idvarchar255teams.id, default to nullFK
sorting_indexint255the field to sort team

team_users

fieldtypelengthdescriptionkey
team_idvarchar255Full name of the team. Eg. team A, team B, etc.Composite PK, FK
user_idvarchar255users.idComposite PK, FK

DWM Entities - (Data Warehouse Middle)

DWM entities are the slight aggregation and operation of DWD to store more organized details or middle-level metrics.

refs_issues_diffs

This table shows the issues fixed by commits added in a new ref compared to an old one. The data is computed from table.ref_commits_diff, table.pull_requests, table.pull_request_commits, and table.pull_request_issues.

This table can support tag-based analysis, for instance, 'No. of bugs closed in a tag'.

fieldtypelengthdescriptionkey
new_ref_idvarchar255The new ref's id for comparisonFK_refs.id
old_ref_idvarchar255The old ref's id for comparisonFK_refs.id
new_ref_commit_shachar40The commit new ref points to at the time of collection
old_ref_commit_shachar40The commit old ref points to at the time of collection
issue_numbervarchar255Issue number
issue_idvarchar255Issue idFK_issues.id