2020-04-12 08:17 门头沟学院 Java

Elasticsearch Quick-Start Example: E-commerce Product Management

Notes

Adding the pretty parameter to the query string of any request makes Elasticsearch pretty-print the JSON response so that it is easier to read.

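Cluster health check and document CRUD

Document data format

A document-oriented search and analytics engine

Application data structures are object-oriented and often complex.
To store an object in a relational database, it has to be taken apart and flattened into multiple tables, and every query has to reassemble the rows into an object again, which is tedious.
Elasticsearch (ES) is document-oriented: the data structure stored in a document is the same as the object's structure. On top of this document model, ES provides sophisticated indexing, full-text search, and analytical aggregations, without spreading the data across multiple tables as a relational database does.
An ES document is expressed as JSON [multiple tables --> one document].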
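
To make the contrast concrete, here is a minimal Python sketch. The nested `student` object and its `address` and `courses` fields are hypothetical and only serve as an illustration; the point is that the relational style needs several flat rows while the document style keeps one JSON document.

```python
import json

# Hypothetical nested object, as an application would hold it in memory.
student = {
    "name": "cznczai",
    "age": 20,
    "address": {"city": "Beijing", "street": "Example Road 1"},
    "courses": [
        {"title": "Java", "score": 90},
        {"title": "Databases", "score": 85},
    ],
}

# Relational style: the object is split into flat rows across three tables.
student_row = {"id": 1, "name": student["name"], "age": student["age"]}
address_row = {"student_id": 1, **student["address"]}
course_rows = [{"student_id": 1, **course} for course in student["courses"]]

# Document style: the whole object is serialized as one JSON document.
document = json.dumps(student, ensure_ascii=False)

print(len(course_rows) + 2, "rows in 3 tables versus 1 document")
print(document)
```

Reading the document back with `json.loads` yields the original object directly, with no joins needed.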

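Quickly checking cluster health

ES provides a set of APIs called the cat API, which can be used to inspect all kinds of information in ES.

GET _cat/health?v [adding v shows the column headers]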
```
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1586565740 08:42:20  elasticsearch yellow          1         1      1   1    0    0        1             0                  -                 50.0%
```
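A new node joins the cluster automatically: simply run elasticsearch.bat again [from a second, separately downloaded copy of the package, so that it does not interfere with the original one].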
```
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1586566250 08:50:50  elasticsearch green           2         2      2   1    0    0        0             0                  -                100.0%
```
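The status changes from yellow to green, and active_shards_percent reaches 100%.
green: every index's primary shards and replica shards are all active.
yellow: every index's primary shards are active, but some replica shards are not active and are unavailable.
red: not every index's primary shards are active, so some indices are missing data.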
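
The three rules above can be sketched as a small Python function. This is only an illustration of the definitions, not how ES computes the status internally; the input format (a dict of index name to a list of shard dicts with `primary` and `active` keys) is an assumption made for the example.

```python
def cluster_status(indices):
    """Derive green/yellow/red from shard states.

    indices: dict mapping index name -> list of shard dicts, each with
    boolean keys 'primary' and 'active' (a hypothetical format).
    """
    shards = [s for shard_list in indices.values() for s in shard_list]
    primaries_active = all(s["active"] for s in shards if s["primary"])
    replicas_active = all(s["active"] for s in shards if not s["primary"])

    if not primaries_active:
        return "red"      # at least one primary shard is not active
    if not replicas_active:
        return "yellow"   # all primaries active, some replica is not
    return "green"        # every primary and replica is active


# One node: the .kibana primary is active, its replica is unassigned.
single_node = {
    ".kibana": [
        {"primary": True, "active": True},
        {"primary": False, "active": False},
    ]
}
print(cluster_status(single_node))  # yellow
```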

As soon as a second ES process is started, the cluster has 2 nodes; the one unassigned replica shard is automatically allocated to the new node, and the cluster status becomes green.

Why is the cluster in the yellow state at this point?

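One computer running a single ES process amounts to a single node. At this point ES holds one index, the one Kibana creates for itself.
In the ES version used here, the default is 5 primary shards per index with 1 replica of each (5 replica shards in total), and a primary shard and its replica cannot be placed on the same node (for fault tolerance).
The index Kibana created has 1 primary shard and 1 replica shard.
With only one node, the primary shard is allocated and started, but there is no second node on which to start the replica shard.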
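
The reasoning above can be replayed with a toy allocator in Python. It is a deliberately simplified sketch: the `allocate` function and the node names are hypothetical, and the only rule it models is that a replica never shares a node with its primary. ES's actual allocation logic considers much more.

```python
def allocate(shards, nodes):
    """Place shard copies on nodes.

    shards: list of (shard_id, role) tuples, role is 'p' or 'r'.
    A copy is never placed on a node that already holds another copy
    of the same shard. Returns (assignment, unassigned).
    """
    assignment = {}   # (shard_id, role) -> node name
    unassigned = []
    for shard_id, role in shards:
        used = {node for (sid, _), node in assignment.items() if sid == shard_id}
        free = [node for node in nodes if node not in used]
        if free:
            assignment[(shard_id, role)] = free[0]
        else:
            unassigned.append((shard_id, role))
    return assignment, unassigned


def active_percent(assignment, unassigned):
    """Share of shard copies that could be assigned, in percent."""
    return 100.0 * len(assignment) / (len(assignment) + len(unassigned))


kibana_index = [(0, "p"), (0, "r")]  # 1 primary shard, 1 replica shard

print(active_percent(*allocate(kibana_index, ["node-1"])))            # 50.0
print(active_percent(*allocate(kibana_index, ["node-1", "node-2"])))  # 100.0
```

The two printed values correspond to the active_shards_percent column in the two health outputs: 50.0% with one node and 100.0% with two.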

Quickly listing the indices in the cluster

GET _cat/indices?v

health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana YzyQ2vcfQcSknHNSoAGOFA   1   1          1            0      3.1kb          3.1kb

Simple index operations. Create an index: PUT /test_index?pretty
Check the result with GET _cat/indices?v

health status index      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   test_index UiHIV1hiQGqiny7r8sqnaw   5   1          0            0       650b           650b
yellow open   .kibana    YzyQ2vcfQcSknHNSoAGOFA   1   1          1            0      3.1kb          3.1kb

Simple index operations. Delete an index: DELETE /test_index?pretty

Product CRUD operations

Create

Add a product: add a new document, which is indexed at the same time

PUT /person/student/1
{
"name":"cznczai",
"age":20
}
||
\/
Response returned by the PUT
{
"_index": "person",    
"_type": "student",
"_id": "1",              <----
"_version": 1,
"result": "created",
"_shards": {
  "total": 2,
  "successful": 1,
  "failed": 0
},
"created": true
}
--------------------------------------
PUT /person/student/2
{
"name":"wcl",
"age":20
}
||
\/
{
"_index": "person",
"_type": "student",
"_id": "2",              <----
"_version": 1,
"result": "created",
"_shards": {
  "total": 2,
  "successful": 1,
  "failed": 0
},
"created": true
}

Read

GET /person/student/1
||
\/
{
"_index": "person",
"_type": "student",
"_id": "1",              <--- the id used in the PUT; had the document been PUT with id 19, this would be 19
"_version": 1,
"found": true,
"_source": {
  "name": "cznczai",
  "age": 20
}
}

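ES creates the index and type automatically, so they do not need to be created in advance. By default, ES also builds an inverted index for every field of a document so that the field can be searched.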
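
As a rough picture of what an inverted index is, the following Python sketch maps each term of a field to the ids of the documents that contain it. It is a toy model under simple assumptions (lowercasing and whitespace splitting stand in for ES's analyzers); `build_inverted_index` is a hypothetical helper, not an ES API.

```python
from collections import defaultdict


def build_inverted_index(docs, field):
    """Map each term found in `field` to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, source in docs.items():
        # Toy analysis step: lowercase and split on whitespace.
        for term in str(source[field]).lower().split():
            index[term].add(doc_id)
    return index


docs = {
    "1": {"name": "cznczai", "age": 20},
    "2": {"name": "wcl", "age": 20},
}

name_index = build_inverted_index(docs, "name")
print(sorted(name_index["wcl"]))  # ['2']
```

A search for a term then becomes a dictionary lookup instead of a scan over every document.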

Delete
DELETE /person/student/1

{
"found": true,
"_index": "person",
"_type": "student",
"_id": "1",
"_version": 22,
"result": "deleted",   ---->   deleting the same id again returns "result": "not_found",
"_shards": {
  "total": 2,
  "successful": 1,
  "failed": 0
}
}

Update

Replace

PUT /ecommerce/product/1
{
  "name" : "jiaqiangban gaolujie yagao"
}
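The drawback of replacement is that the request must carry every field, even when only one field changes;
any field that is left out is lost, because the new body overwrites the whole document.

For a partial update, use POST with _update and put the changed fields inside doc.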

POST /person/student/1/_update
{
"doc": { 
  "name":"cznczai"
}
}
||
\/
{
"_index": "person",
"_type": "student",
"_id": "1",
"_version": 3,
"result": "updated",
"_shards": {
  "total": 2,
  "successful": 1,
  "failed": 0
},
"created": false
}
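
The difference between full replacement (PUT) and partial update (POST with _update) can be simulated with plain dictionaries. This is a sketch only: `put_document` and `partial_update` are hypothetical helpers working on an in-memory `store`, and the shallow merge is a simplification of what ES does with the doc body.

```python
def put_document(store, doc_id, body):
    """Full replacement: the stored document becomes exactly `body`."""
    store[doc_id] = dict(body)


def partial_update(store, doc_id, body):
    """Partial update: merge the fields under 'doc' into the stored document."""
    store[doc_id] = {**store[doc_id], **body["doc"]}


store = {}
put_document(store, "1", {"name": "cznczai", "age": 20})

# Replacing with only one field drops the others.
put_document(store, "1", {"name": "czn"})
print(store["1"])  # {'name': 'czn'}

# Restore the document, then change the same field with a partial update.
put_document(store, "1", {"name": "cznczai", "age": 20})
partial_update(store, "1", {"doc": {"name": "czn"}})
print(store["1"])  # {'name': 'czn', 'age': 20}
```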

Multiple ways to search

query string search

Search all products: GET /person/student/_search

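{
"took": 60,       time taken, in milliseconds
"timed_out": false,    whether the request timed out; here it did not
"_shards": {      the data is split into 5 shards, so the search request goes to every primary shard (or to one of its replica shards instead)
  "total": 5,
  "successful": 5,
  "failed": 0
},
"hits": {
  "total": 3, number of results: 3 documents
  "max_score": 1,   a score is the relevance of a document to the search; the more relevant the document, the better the match and the higher the score
  "hits": [{..index data.},{.index data..},{..index data.}]  the full data of the documents that matched the search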
  }
}

The name query string search comes from the fact that the search parameters are carried in the query string of the HTTP request.
GET /person/student/_search?q=name:wcl&sort=age:desc

"hits": {
  "total": 2,
  "max_score": null,
  "hits": [
    {
      "_index": "person",
      "_type": "student",
      "_id": "17",
      "_score": null,
      "_source": {
        "name": "wcl",
        "age": 20
      },
      "sort": [
        20
      ]
    },
    {
      "_index": "person",
      "_type": "student",
      "_id": "2",
      "_score": null,
      "_source": {
        "name": "wcl",
        "age": 19
      },
      "sort": [
        19
      ]
    }
  ]
}

However, a complex search request is hard to build this way.
In production, query string search is rarely used.
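
To see why this style gets awkward, here is a small Python sketch that assembles such a URL with the standard library. `search_url` is a hypothetical helper; note that every condition has to be squeezed into flat, URL-encoded key=value pairs.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_url(index, doc_type, **params):
    """Build the path and query string for a query string search."""
    path = f"/{index}/{doc_type}/_search"
    return f"{path}?{urlencode(params)}" if params else path


url = search_url("person", "student", q="name:wcl", sort="age:desc")
print(url)

# Decoding the query string gives back the original parameters.
print(parse_qs(urlsplit(url).query))
```

Anything beyond a couple of conditions quickly becomes one long encoded string, which is why a JSON request body is easier to maintain.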

query DSL

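DSL: Domain Specific Language.
HTTP request body: the query is written as JSON in the request body, which is convenient and can express all kinds of complex syntax, making it far more powerful than query string search.

Query all products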

GET /person/student/_search
{
"query": {
  "match_all": {}
},
"sort": [
  {
    "age": "desc"
  }
]
}

Query documents whose name is cznczai, sorted by age in descending order

GET /person/student/_search
{
"query": {
  "match": {
    "name": "cznczai"
  }
},
"sort": [
  {
    "age": "desc"
  }
]
}
||
\/
"hits": {
  "total": 1,
  "max_score": null,
  "hits": [
    {
      "_index": "person",
      "_type": "student",
      "_id": "1",
      "_score": null,
      "_source": {
        "name": "cznczai",
        "age": 21
      },
      "sort": [
        21
      ]
    }
  ]
}

Pagination: where to start (from) and how many results to return (size)

GET /person/student/_search
{
"query": {
  "match_all": {}
},
"from": 1, 
"size": 1, 
"sort": [
  {
    "age": "desc"
  }
]
}

Return only the name field of each document

GET /person/student/_search
{
"query": {
  "match_all": {}
},
"from": 1, 
"size": 1, 
"_source": ["name"], 
"sort": [
  {
    "age": "desc"
  }
]
}
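
What sort, from, size and _source do to the result can be imitated locally in Python. This is a sketch on an in-memory list, not a call to ES; `run_search` and the three sample documents are hypothetical.

```python
def run_search(docs, sort_field, from_=0, size=10, source=None):
    """Imitate sort (descending), from/size paging and _source filtering."""
    ordered = sorted(docs, key=lambda d: d[sort_field], reverse=True)
    page = ordered[from_:from_ + size]          # skip `from_`, keep `size`
    if source is not None:
        # Keep only the requested fields of each hit.
        page = [{k: d[k] for k in source if k in d} for d in page]
    return page


docs = [
    {"name": "cznczai", "age": 21},
    {"name": "wcl", "age": 20},
    {"name": "wcl", "age": 19},
]

print(run_search(docs, "age", from_=1, size=1, source=["name"]))
# [{'name': 'wcl'}]
```

With from 1 and size 1, the first hit (age 21) is skipped and exactly one document, the one with age 20, is returned with only its name.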

The query DSL is better suited to production use because it can express complex queries.

query filter

Search for documents whose name contains wcl and whose age is greater than or equal to 10 and less than or equal to 19

GET /person/student/_search
{
"query": {
  "bool": {
    "must": [
      {
        "match": {
          "name": "wcl"
        }
      }
    ], 
    "filter": {
      "range": {
        "age": {
          "gte": 10,
          "lte": 19
        }
      }
    }
  }
},
"sort": [
  {
    "age": "desc"
  }
]
}
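
The structure of this request can be built and checked locally with a short Python sketch. `bool_query` and `matches` are hypothetical helpers; `matches` only imitates the yes/no outcome of a must match clause plus a range filter on in-memory documents and does no scoring.

```python
def bool_query(must_match, range_filter):
    """Build a bool query body with one match clause and one range filter."""
    return {
        "query": {
            "bool": {
                "must": [{"match": must_match}],
                "filter": {"range": range_filter},
            }
        }
    }


def matches(doc, body):
    """Toy evaluation of the body above against one document."""
    bool_part = body["query"]["bool"]
    for clause in bool_part["must"]:
        for field, text in clause["match"].items():
            wanted = set(text.lower().split())
            present = set(str(doc[field]).lower().split())
            if not wanted & present:        # no query term found in the field
                return False
    for field, bounds in bool_part["filter"]["range"].items():
        value = doc[field]
        if "gte" in bounds and value < bounds["gte"]:
            return False
        if "lte" in bounds and value > bounds["lte"]:
            return False
    return True


body = bool_query({"name": "wcl"}, {"age": {"gte": 10, "lte": 19}})
docs = [
    {"name": "wcl", "age": 20},
    {"name": "wcl", "age": 19},
    {"name": "cznczai", "age": 15},
]
print([d for d in docs if matches(d, body)])  # [{'name': 'wcl', 'age': 19}]
```

In ES, clauses under filter only include or exclude documents and do not contribute to the relevance score.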

full-text search

Full-text search example

POST /person/student/2/_update
{
"doc": {
  "name":"czn cznczai cgc"
}
}
GET /person/student/_search
{
"query": {
  "match": {
    "name": "cgc wcl"  <-- a document does not need to contain both cgc and wcl
  }
}
}
||
\/   no explicit sort is applied, so hits are ordered by relevance and max_score is populated
"hits": {
  "total": 3,
  "max_score": 0.2876821,
  "hits": [
    {
...
      "_score": 0.2876821,  
        "name": "wcl",   
...
      }
    },
    {
...
      "_score": 0.25316024,
        "name": "czn cznczai cgc",
...
      }
    },
    {
...
      "_score": 0.25316024,
        "name": "wcl czn cznczai",
...
      }
    }
  ]
}

phrase search

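Phrase search is the counterpart, and in a sense the opposite, of full-text search. Full-text search breaks the input search string into terms and matches each one against the inverted index; a document is returned as long as it matches any one of the terms.
Phrase search requires the specified field to contain the input search string exactly as given, with the same terms in the same order; only then does the document count as a match and get returned.

Query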

GET /person/student/_search
{
"query": {
  "match_phrase": {
    "name": "wcl czn"   <-- the field must contain wcl czn as a phrase, both terms together and in this order
  }
}
}
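
The contrast between match and match_phrase can be sketched in a few lines of Python. This is a toy model under simple assumptions (lowercase, whitespace tokens, no scoring, no slop); `match` and `match_phrase` here are hypothetical local functions named after the queries they imitate.

```python
def tokens(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(field_text, query_text):
    """True if the field contains at least one of the query terms."""
    return bool(set(tokens(field_text)) & set(tokens(query_text)))


def match_phrase(field_text, query_text):
    """True if the query terms appear consecutively and in order."""
    field, query = tokens(field_text), tokens(query_text)
    n = len(query)
    return any(field[i:i + n] == query for i in range(len(field) - n + 1))


names = ["wcl", "czn cznczai cgc", "wcl czn cznczai"]

print([n for n in names if match(n, "cgc wcl")])
# ['wcl', 'czn cznczai cgc', 'wcl czn cznczai']
print([n for n in names if match_phrase(n, "wcl czn")])
# ['wcl czn cznczai']
```

All three names satisfy the match query because each contains cgc or wcl, while only the last one contains wcl immediately followed by czn.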

highlight search

Highlighting post-processes the returned data: the matched terms in the field are wrapped in tags

GET /person/student/_search
{
"query": {
  "match": {
    "name": "czn"
  }
},
"highlight": {
  "fields": {
    "name":{}      
  } 
}
}
||
\/
"highlight": {
        "name": [
          "<em>czn</em> cznczai cgc"
        ]
      }
...
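
The effect of highlighting can be imitated with a small Python function. It is a sketch only: `highlight` is a hypothetical helper that wraps whole whitespace-separated tokens matching a query term in em tags, which is much simpler than what ES's highlighters do.

```python
def highlight(field_text, query_text, pre="<em>", post="</em>"):
    """Wrap every token of the field that equals a query term in tags."""
    wanted = set(query_text.lower().split())
    return " ".join(
        f"{pre}{word}{post}" if word.lower() in wanted else word
        for word in field_text.split()
    )


print(highlight("czn cznczai cgc", "czn"))  # <em>czn</em> cznczai cgc
```

Only the token czn is wrapped; cznczai is a different term, so it is left untouched, as in the response shown.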