
The formal experiment report is in EXPERIMENT.

Preparation

We have compared the performance of gStore with several other database systems, such as Jena, Sesame and Virtuoso. The items compared are the time to build a database, the size of the built database, the time to answer a single SPARQL query, and whether the results of each query match across systems. In addition, when the memory cost is very large (>20G), we record the memory cost of running these database systems. These memory figures are not accurate and are for reference only.

To ensure that all database systems can run correctly on all datasets and queries, the format of the datasets must be supported by every database system, and the queries must not contain update operations, aggregate operations, or operations involving uncertain predicates. Note that the time to load the database index must not be included when measuring the time to answer queries. To enforce this, we load the database index first for some database systems, and run several warm-up queries for the others.
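The warm-up approach described above can be sketched as follows. This is only an illustration under stated assumptions, not the code used by full_test: `run_query` is a hypothetical callable that sends one SPARQL query to a system under test, and the `warmups` and `repeats` defaults are arbitrary.

```python
import time
from typing import Callable, List


def time_query(run_query: Callable[[], object],
               warmups: int = 3,
               repeats: int = 5) -> List[float]:
    """Return wall-clock timings in seconds for `repeats` measured runs.

    `run_query` is a hypothetical zero-argument callable that executes
    one query against the system under test.
    """
    # Unmeasured runs: these give the system a chance to load its index
    # and fill its caches, so that cost stays out of the timings below.
    for _ in range(warmups):
        run_query()

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings
```

Reporting the median of the returned timings rather than a single run reduces the effect of occasional outliers.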

The datasets used here are WatDiv, LUBM, BSBM and DBpedia. Some of them are provided by websites, and others are generated by algorithms. The queries are either generated by algorithms or written by us.

The experiment environment is a CentOS server with 82G of memory and 7T of disk. We use full_test to run this test.

Result

This program produces many logs, which are placed in result.log/, load.log/ and time.log/. The files in result.log/ show that the results of all queries match across systems. The files in load.log/ show that gStore's time cost and space cost for building a database are larger than those of the other systems. More precisely, gStore differs from the others by an order of magnitude in the time and space cost of building a database.
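One way to check that two systems return matching results for a query is an order-insensitive comparison of the result rows. The sketch below is an assumption about how such a check could be written, not a description of how full_test compares the files in result.log/. The function name `results_match` is hypothetical, and the sketch assumes each result is a sequence of rows with the variables in the same column order.

```python
from collections import Counter
from typing import Hashable, Iterable, Sequence


def results_match(a: Iterable[Sequence[Hashable]],
                  b: Iterable[Sequence[Hashable]]) -> bool:
    """Compare two query results as multisets of rows.

    Row order is ignored, but duplicate rows are counted, because a
    query without DISTINCT may legitimately return repeated rows.
    """
    return Counter(tuple(row) for row in a) == Counter(tuple(row) for row in b)
```

For example, `results_match([("s1", "o1"), ("s2", "o2")], [("s2", "o2"), ("s1", "o1")])` is true, while two results that differ only in how many times a row is repeated do not match.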

Analysing time.log/ shows that gStore performs better than the others on very complicated queries (many variables, cycles, etc.). For simpler queries, there is not much difference in query time between these database systems.

Generally speaking, the memory cost of gStore when answering queries is higher than that of the others. The more complicated the query and the larger the dataset, the more apparent this difference becomes.

You can find more detailed information in the test report. Note that some of the problems described in the test report have already been solved.
