Elasticsearch (ES) is built on Lucene. Reference guide: https://www.elastic.co/guide/cn/elasticsearch/guide/current/index.html
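ES document metadata
A document has three required metadata elements:
_index
Where the document lives. An index should be a collection of documents grouped together because they share common characteristics.
_type
The class of object the document represents; a sub-partition of the index.
_id
The unique identifier of the document. It can be supplied by the user or generated automatically. An auto-generated ID is a URL-safe, Base64-encoded GUID string 20 characters long. These GUIDs are produced by a modified FlakeID scheme, which lets multiple nodes generate unique IDs in parallel with an essentially zero chance of collision.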
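To see why such an ID comes out at exactly 20 characters, the following Python sketch encodes 15 random bytes as URL-safe Base64. It only illustrates the encoding and is not the generator ES uses; the function name random_url_safe_id is made up for this example.

```python
import base64
import os


def random_url_safe_id() -> str:
    """Return a 20-character URL-safe Base64 string built from 15 random bytes."""
    # 15 bytes = 120 bits = exactly 20 Base64 characters, so no '=' padding appears.
    raw = os.urandom(15)
    return base64.urlsafe_b64encode(raw).decode("ascii")


example_id = random_url_safe_id()
print(example_id, len(example_id))
```

The URL-safe alphabet uses - and _ in place of + and /, so the ID can be placed directly in a request path.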
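Optional element:
_version: the document's version number. Every document in Elasticsearch has one, and _version is incremented each time the document is changed, including when it is deleted.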
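Fields in an ES search response
took: how long the search took, in milliseconds.
_shards: the number of shards that took part in the search, with counts of how many succeeded and how many failed.
timed_out: whether the search timed out. A timeout can be set on the request: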
GET /_search?timeout=10ms
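hits:
total: the total number of documents that matched. The hits array inside hits holds the first ten matching documents.
Each result also has a _score, which measures how well the document matches the query. By default the most relevant results come first, that is, documents are returned in descending order of _score. In this example no query was specified, so all documents are equally relevant and every result gets the neutral _score of 1. max_score is the highest _score among the documents that matched the query.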
t@ubuntu:~$ curl -XGET 'localhost:9200/_search?pretty'
{"took" : 759,"timed_out" : false,"_shards" : {"total" : 5,"successful" : 5,"failed" : 0},"hits" : {"total" : 3,"max_score" : 1.0,"hits" : [{"_index" : "website","_type" : "blog","_id" : "123","_score" : 1.0,"_source" : {"title" : "My first blog entry","text" : "I am starting to get the hang of this...","date" : "2014/01/02"}},{"_index" : "website","_type" : "blog","_id" : "AV6ZDic0Gtb1Pf5XS4Nu","_score" : 1.0,"_source" : {"title" : "My second blog entry","text" : "Still trying this out...","date" : "2014/01/01"}},{"_index" : "website","_type" : "blog","_id" : "1","_score" : 1.0,"_source" : {"title" : "My first blog entry","text" : "Starting to get the hang of this...","views" : 2,"tags" : ["testing"]}}]}
}
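A small Python sketch of how a client might read the fields described above. It assumes the response shape of the sample output (in particular, hits.total as a plain integer); summarize_search is a hypothetical helper name, and the _source bodies are left out to keep it short.

```python
import json

# Trimmed copy of the sample response (the _source bodies are omitted).
SAMPLE_RESPONSE = json.loads("""
{
  "took": 759,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 3,
    "max_score": 1.0,
    "hits": [
      {"_index": "website", "_type": "blog", "_id": "123", "_score": 1.0},
      {"_index": "website", "_type": "blog", "_id": "AV6ZDic0Gtb1Pf5XS4Nu", "_score": 1.0},
      {"_index": "website", "_type": "blog", "_id": "1", "_score": 1.0}
    ]
  }
}
""")


def summarize_search(response: dict) -> dict:
    """Pull the commonly inspected fields out of a search response."""
    shards = response["_shards"]
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "all_shards_ok": shards["successful"] == shards["total"],
        "total": hits["total"],
        "max_score": hits["max_score"],
        "ids": [hit["_id"] for hit in hits["hits"]],
    }


print(summarize_search(SAMPLE_RESPONSE))
```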
CRUD operations in ES
1. Index (save) a document
PUT /website/blog/123
{"title": "My first blog entry","text": "Just trying this out...","date": "2014/01/01"
}
2. Retrieve documents
Get a document by ID
GET /website/blog/123?pretty
Return only some fields, here just title and text
GET /website/blog/123?_source=title,text
Return only the _source, without any metadata
GET /website/blog/123/_source
Check whether a document exists
curl -i -XHEAD http://localhost:9200/website/blog/123
Exists: 200
Does not exist: 404
Return only specified fields from a search
http://ip:9200/index/type/_search?_source=createTime
Retrieve multiple documents in one request:
GET /_mget
{"docs" : [{"_index" : "website","_type" : "blog","_id" : 2},{"_index" : "website","_type" : "pageviews","_id" : 1,"_source": "views"}]
}
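When the list of documents is built in code, the _mget body can be assembled like this. This is a sketch only: mget_body and its tuple layout are hypothetical, and it simply mirrors the request body shown above.

```python
import json


def mget_body(refs) -> str:
    """Build an _mget request body.

    refs is an iterable of (index, doc_type, doc_id, source) tuples;
    source may be None to request the whole document.
    """
    docs = []
    for index, doc_type, doc_id, source in refs:
        entry = {"_index": index, "_type": doc_type, "_id": doc_id}
        if source is not None:
            entry["_source"] = source  # restrict the returned fields
        docs.append(entry)
    return json.dumps({"docs": docs})


print(mget_body([
    ("website", "blog", 2, None),
    ("website", "pageviews", 1, "views"),
]))
```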
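Searching across multiple indices and types:
/_search
Search all types in all indices
/gb/_search
Search all types in the gb index
/gb,us/_search
Search all documents in the gb and us indices
/g*,u*/_search
Search all types in any index whose name starts with g or u
/gb/user/_search
Search the user type in the gb index
/gb,us/user,tweet/_search
Search the user and tweet types in the gb and us indices
/_all/user,tweet/_search
Search the user and tweet types in all indices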
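The path patterns above follow one rule: comma-joined indices, then comma-joined types, then _search, with _all standing in when types are given without indices. A minimal Python sketch of that rule (search_path is a hypothetical helper, not part of any client library):

```python
def search_path(indices=(), doc_types=()) -> str:
    """Build a search path from lists of index names and type names."""
    index_part = ",".join(indices)
    type_part = ",".join(doc_types)
    if type_part and not index_part:
        index_part = "_all"  # types need an index segment in front of them
    parts = [part for part in (index_part, type_part) if part]
    return "/" + "/".join(parts + ["_search"])


print(search_path())                                # /_search
print(search_path(["gb", "us"], ["user", "tweet"]))  # /gb,us/user,tweet/_search
print(search_path(doc_types=["user", "tweet"]))      # /_all/user,tweet/_search
```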
Pagination (size is the number of results per page, from is the number of results to skip):
GET /_search?size=5&from=10
http://localhost:9200/ct_ws/type/_search?sort=createTime:desc&pretty&size=20000&from=0&_source=createTime,url
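The size and from values for a given page can be computed as below. This is a small sketch assuming 1-based page numbers; page_query is a made-up helper name.

```python
from urllib.parse import urlencode


def page_query(page: int, size: int = 10) -> str:
    """Return the query string for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    # from is the number of hits to skip before the requested page starts.
    return urlencode({"size": size, "from": (page - 1) * size})


print("/_search?" + page_query(3, size=5))  # /_search?size=5&from=10
```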
3. Update a document
To replace a document by ID, simply PUT the whole document again; _version is incremented automatically
PUT /website/blog/123
{"title": "My first blog entry","text": "I am starting to get the hang of this...","date": "2014/01/02"
}
t@ubuntu:~$ curl -XPUT 'localhost:9200/website/blog/123?pretty' -H 'Content-Type: application/json' -d'
> {
> "title": "My first blog entry",
> "text": "I am starting to get the hang of this...",
> "date": "2014/01/02"
> }
> '
{"_index" : "website","_type" : "blog","_id" : "123","_version" : 2,"result" : "updated","_shards" : {"total" : 2,"successful" : 1,"failed" : 0},"created" : false
}
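Internally, ES marks the old document as deleted and indexes the new version as an entirely new document.
Add fields by ID (partial update)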
POST /website/blog/1/_update
{"doc" : {"tags" : [ "testing" ],"views": 0}
}
Modify a field by ID with a script: increment views by 1
POST /website/blog/1/_update
{"script" : "ctx._source.views+=1"
}
Example:
t@ubuntu:~$ curl -XPOST 'localhost:9200/website/blog/1/_update?pretty' -H 'Content-Type: application/json' -d'
{"script" : "ctx._source.views+=1"
}
'
{"_index" : "website","_type" : "blog","_id" : "1","_version" : 5,"result" : "updated","_shards" : {"total" : 2,"successful" : 1,"failed" : 0}
}
t@ubuntu:~$ curl -XGET 'localhost:9200/website/blog/1?pretty'
{"_index" : "website","_type" : "blog","_id" : "1","_version" : 5,"found" : true,"_source" : {"title" : "My first blog entry","text" : "Starting to get the hang of this...","views" : 2,"tags" : ["testing"]}
}
t@ubuntu:~$
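The effect of the two _update forms on a stored document can be modeled in plain Python. This is a simplified in-memory sketch, not how ES implements updates: it does a shallow merge only, always bumps the version, and the function names are invented for illustration.

```python
def apply_partial_update(stored: dict, doc: dict) -> dict:
    """Model the doc form of _update: merge fields into _source, bump _version."""
    merged = {**stored["_source"], **doc}  # fields in doc win over stored ones
    return {**stored, "_source": merged, "_version": stored["_version"] + 1}


def apply_increment(stored: dict, field: str, amount: int = 1) -> dict:
    """Model a script such as ctx._source.views+=1."""
    source = dict(stored["_source"])
    source[field] = source[field] + amount
    return {**stored, "_source": source, "_version": stored["_version"] + 1}


original = {
    "_id": "1",
    "_version": 1,
    "_source": {"title": "My first blog entry", "text": "Starting to get the hang of this..."},
}
with_fields = apply_partial_update(original, {"tags": ["testing"], "views": 0})
incremented = apply_increment(with_fields, "views")
print(incremented["_version"], incremented["_source"]["views"])  # 3 1
```

Each function returns a new dictionary and leaves its input untouched, which loosely mirrors the fact that an update produces a new version of the document rather than editing it in place.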
4. Delete a document
Deleting a document does not remove it from disk immediately; it is only marked as deleted
Delete a document by ID
DELETE /website/blog/123
Bulk operations
POST /_bulk
{ "delete": { "_index": "website", "_type": "blog", "_id": "123" }}
{ "create": { "_index": "website", "_type": "blog", "_id": "123" }}
{ "title": "My first blog post" }
{ "index": { "_index": "website", "_type": "blog" }}
{ "title": "My second blog post" }
{ "update": { "_index": "website", "_type": "blog", "_id": "123", "_retry_on_conflict" : 3} }
{ "doc" : {"title" : "My updated blog post"} }
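A bulk body is newline-delimited JSON: one action line, followed by a source line for every action except delete, and the body must end with a newline. The Python sketch below builds the same body as above; bulk_body and its tuple format are hypothetical helpers for illustration.

```python
import json


def bulk_body(operations) -> str:
    """Serialize (action, metadata, source) tuples into a _bulk request body."""
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}))
        if action == "delete":
            continue  # delete is the only action without a following source line
        if source is None:
            raise ValueError(f"{action} needs a source line")
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the final newline is required


body = bulk_body([
    ("delete", {"_index": "website", "_type": "blog", "_id": "123"}, None),
    ("create", {"_index": "website", "_type": "blog", "_id": "123"}, {"title": "My first blog post"}),
    ("index", {"_index": "website", "_type": "blog"}, {"title": "My second blog post"}),
    ("update", {"_index": "website", "_type": "blog", "_id": "123", "_retry_on_conflict": 3},
     {"doc": {"title": "My updated blog post"}}),
])
print(body, end="")
```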
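Clear all data from an index. This is closer to MySQL's DELETE FROM table than to DROP TABLE, because the index itself and its mapping are kept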
curl -XPOST 'http://ip:9600/megacorp/employee/_delete_by_query' -H 'Content-Type: application/json' -d'
{"query": {"match_all": {}}
}
'
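Concurrency control
ES uses the version number for optimistic concurrency control over concurrent reads and writes:
For an existing document, apply the change only if its current _version is 1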
PUT /website/blog/1?version=1
{"title": "My first blog entry","text": "Starting to get the hang of this..."
}
If the version does not match, the request fails with a conflict error:
t@ubuntu:~$
t@ubuntu:~$ curl -XPUT 'localhost:9200/website/blog/1?version=1&pretty' -H 'Content-Type: application/json' -d'
> {
> "title": "My first blog entry",
> "text": "Starting to get the hang of this..."
> }
> '
{"error" : {"root_cause" : [{"type" : "version_conflict_engine_exception","reason" : "[blog][1]: version conflict, current version [2] is different than the one provided [1]","index_uuid" : "aGPDbTmcTjKyhQ4fEJ6NEw","shard" : "3","index" : "website"}],"type" : "version_conflict_engine_exception","reason" : "[blog][1]: version conflict, current version [2] is different than the one provided [1]","index_uuid" : "aGPDbTmcTjKyhQ4fEJ6NEw","shard" : "3","index" : "website"},"status" : 409
}
t@ubuntu:~$
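The version check behind this 409 can be modeled with a tiny in-memory store. This is a sketch of the optimistic-locking idea only, under the assumption that a write carrying a version succeeds just when it equals the stored one; TinyStore and VersionConflict are invented names.

```python
class VersionConflict(Exception):
    """Raised when the caller's expected version is stale."""


class TinyStore:
    def __init__(self):
        self._docs = {}

    def put(self, doc_id, source, version=None):
        """Store source; if version is given, it must match the current version."""
        current = self._docs.get(doc_id)
        current_version = current["_version"] if current else 0
        if version is not None and version != current_version:
            raise VersionConflict(
                f"current version [{current_version}] is different "
                f"than the one provided [{version}]"
            )
        new_version = current_version + 1
        self._docs[doc_id] = {"_version": new_version, "_source": source}
        return new_version


store = TinyStore()
store.put("1", {"title": "My first blog entry"})             # version becomes 1
store.put("1", {"title": "edited by A"}, version=1)          # succeeds, version becomes 2
try:
    store.put("1", {"title": "edited by B"}, version=1)      # stale, rejected
except VersionConflict as exc:
    print("conflict:", exc)
```

The second writer is expected to re-read the document, reapply its change, and retry with the new version.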
Using an external version number (for example, one maintained by another database):
PUT /website/blog/2?version=5&version_type=external
{"title": "My first external blog entry","text": "Starting to get the hang of this..."
}
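With version_type=external the comparison changes: instead of requiring an exact match, the write is accepted only when the supplied version is greater than the stored one, and the supplied number becomes the new _version. A minimal Python sketch of that rule (the function name is hypothetical):

```python
def index_with_external_version(current_version, incoming_version):
    """Return the new stored version, or raise if the incoming one is not newer.

    current_version is None when the document does not exist yet.
    """
    if current_version is not None and incoming_version <= current_version:
        raise ValueError(
            f"version conflict, current version [{current_version}] is higher "
            f"or equal to the one provided [{incoming_version}]"
        )
    return incoming_version  # the external number is stored as-is


print(index_with_external_version(None, 5))  # 5
print(index_with_external_version(5, 10))    # 10
```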