Solr vs. MySQL Query Performance Comparison

Test data size: 10,407,608 rows

num docs: 10407608

One of the most common queries in the project fetches the data collected within a time range. As a plain SQL query it takes about 30 s:

select * from `tf_hotspotdata_copy_test` where collecttime between '2014-12-06 00:00:00' and '2014-12-10 21:31:55';

After creating an index on collecttime, the same query takes about 2 s, which is much faster.

Solr index

The same query run against Solr, with the same conditions, takes 72 ms:

"status": 0,
"QTime": 72

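The /export handler requires the fields involved to be indexed with docValues enabled (docValues="true" in the schema).

An export over docValues must specify a sort field, and only the following types are supported: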

sort fields must be one of the following types: int,float,long,double,string

Field types that can be returned by an export over docValues:

export fields must either be one of the following types: int,float,long,double,string

Querying and fetching the data with SolrJ:

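import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;

// Time-range condition goes in q, additional filters in fq.
SolrQuery params = new SolrQuery();
params.set("q", timeQueryString);
params.set("fq", queryString);
params.set("start", 0);
params.set("rows", Integer.MAX_VALUE);
params.set("sort", "id asc");      // /export requires a sort on a docValues field
params.setHighlight(false);
params.set("qt", "/export");       // route the request to the /export handler
params.setFields(retKeys);         // fields to return (fl)
QueryResponse response = server.query(params);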

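A bug:

SolrJ could not parse the result set correctly. A look at the source shows why: the Content-Type returned by the Solr server does not match the one SolrJ checks for when parsing. In SolrJ's BinaryResponseParser the content type is hard-coded:

public class BinaryResponseParser extends ResponseParser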
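One possible way around the parser mismatch, sketched here under assumptions rather than taken from the original project, is to bypass SolrJ's response parsing for this handler: request /export over plain HTTP and read the JSON body directly. The sketch below only builds the request URL. The base URL http://localhost:8983/solr/collection1, the field list, and the class and method names are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class ExportUrlBuilder {

    // Builds "<coreUrl>/export?q=...&sort=...&fl=..." with URL-encoded values.
    // /export needs both a sort and a field list, so all three are mandatory here.
    static String buildExportUrl(String coreUrl, String q, String sort, String fl) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", q);
        params.put("sort", sort);
        params.put("fl", fl);

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return coreUrl + "/export?" + query;
    }

    public static void main(String[] args) {
        // Hypothetical core URL and fields, for illustration only.
        System.out.println(buildExportUrl(
                "http://localhost:8983/solr/collection1",
                "deviceid:1013", "id asc", "id,collecttime"));
    }
}
```

The resulting URL can then be fetched with any HTTP client and the response parsed with a JSON library of choice.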

Timing comparison:

Query (statistics)        Time
MySQL (no index)          33 s
MySQL (with index)        14 s
SolrJ (facet query)       0.54 s

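Solr is also convenient when we need statistics for one device over a time range, bucketed by hour, week, month, or year. For example, the following counts the records for device 1013 per day: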

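import java.util.Date;
import java.util.List;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.RangeFacet;

String startTime = "2014-12-06 00:00:00";
String endTime = "2014-12-16 21:31:55";
SolrQuery query = new SolrQuery();
query.set("q", "deviceid:1013");
query.setFacet(true);
// DateFormatHelper is the project's own utility class.
Date start = DateFormatHelper.toSolrSearchDate(DateFormatHelper.stringToDate(startTime));
Date end = DateFormatHelper.toSolrSearchDate(DateFormatHelper.stringToDate(endTime));
// One bucket per day; Solr date-math units must be upper case.
query.addDateRangeFacet("collecttime", start, end, "+1DAY");
QueryResponse response = server.query(query);
List<RangeFacet> dateFacetFields = response.getFacetRanges();
for (RangeFacet facetField : dateFacetFields) {
    // Loop body reconstructed from the output below: one "bucket start: count" line per day.
    List<RangeFacet.Count> counts = facetField.getCounts();
    for (RangeFacet.Count count : counts) {
        System.out.println(count.getValue() + ": " + count.getCount());
    }
}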

2014-12-06T00:00:00Z: 58
2014-12-07T00:00:00Z: 0
2014-12-08T00:00:00Z: 0
2014-12-09T00:00:00Z: 0
2014-12-10T00:00:00Z: 3707
2014-12-11T00:00:00Z: 8384
2014-12-12T00:00:00Z: 7803
2014-12-13T00:00:00Z: 2469
2014-12-14T00:00:00Z: 142
2014-12-15T00:00:00Z: 34
2014-12-16T00:00:00Z: 0

time: 662
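The same facet can be bucketed by hour, week, month, or year by changing only the gap. As a minimal sketch, the code below maps each granularity to a Solr date-math gap and assembles the standard facet.range.* request parameters. The class, enum, and method names are illustrative and do not come from the original code. Solr date math has no WEEK unit, so a week is written as +7DAYS.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FacetRangeParams {

    // Bucket sizes mentioned in the text, expressed as Solr date-math gaps.
    enum Granularity {
        HOUR("+1HOUR"), DAY("+1DAY"), WEEK("+7DAYS"), MONTH("+1MONTH"), YEAR("+1YEAR");

        final String gap;

        Granularity(String gap) {
            this.gap = gap;
        }
    }

    // Returns the request parameters for a range facet on a date field.
    // startIso and endIso are expected in Solr's UTC format, e.g. 2014-12-06T00:00:00Z.
    static Map<String, String> build(String field, String startIso, String endIso, Granularity g) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("facet", "true");
        params.put("facet.range", field);
        params.put("facet.range.start", startIso);
        params.put("facet.range.end", endIso);
        params.put("facet.range.gap", g.gap);
        return params;
    }

    public static void main(String[] args) {
        build("collecttime", "2014-12-06T00:00:00Z", "2014-12-16T21:31:55Z", Granularity.WEEK)
                .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Buckets are counted from facet.range.start, so for weekly or monthly statistics the start value should itself fall on a week or month boundary.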

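Horizontal table partitioning:

Much of the data this system collects is closely tied to time, and many business requirements query by time, so the table could be split on the time field, for example one table per month. That, however, pushes more work onto the application layer, and queries that span tables need extra effort as well. After weighing the workload of splitting the table against that of using Solr for indexed queries, we chose Solr.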
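To illustrate the extra application-layer work that monthly tables would bring, here is a small sketch of the routing step alone: working out which tables a time-range query has to touch. The base name tf_hotspotdata and the _yyyyMM suffix are assumed naming conventions for illustration, not something the original system defines.

```java
import java.time.LocalDateTime;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

final class MonthlyTableRouter {

    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("yyyyMM");

    // Lists every monthly table that overlaps [from, to], oldest first.
    // Returns an empty list when from is in a later month than to.
    static List<String> tablesFor(String baseName, LocalDateTime from, LocalDateTime to) {
        List<String> tables = new ArrayList<>();
        YearMonth last = YearMonth.from(to);
        for (YearMonth ym = YearMonth.from(from); !ym.isAfter(last); ym = ym.plusMonths(1)) {
            tables.add(baseName + "_" + ym.format(SUFFIX));
        }
        return tables;
    }

    public static void main(String[] args) {
        System.out.println(tablesFor("tf_hotspotdata",
                LocalDateTime.of(2014, 11, 28, 0, 0),
                LocalDateTime.of(2015, 1, 3, 0, 0)));
    }
}
```

Each returned table would then need its own query, and the application would have to merge, sort, and paginate the partial results itself.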

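Summary: pairing MySQL with a search engine such as Lucene, Solr, or Elasticsearch can improve the performance of queries such as full-text search and faceted statistics.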
