Go To My HomePage

solr的应用

一、Solr安装包主要目录布局

solr
  |
  +- bin
  |	 |
  |	 +- init.d【linux中制作solr服务的配置文件】
  |	 |
  |	 +- solr和solr.cmd【solr在windows和linux下的启动脚本】
  |	 |
  |	 +- post【与solr沟通的简单命令行窗口】
  |	 |
  |	 +- solr.in.sh和solr.in.cmd【启动solr是配置的一些参数和检查项】
  |	 |
  |	 +- install_solr_services.sh【linux下安装solr服务的脚本】
  |
  +- contrib【插件目录】
  |
  +- dist【solr的jar包】
  |
  +- example【示例】
  |
  +- server【服务目录】
     |
     +- solr-webapp【UI界面】 
     |
     +- lib【Jetty的lib包】
     |
     +- logs和resources【日志和日志配置】
     |
     +- solr【示例配置】

二、Solr安装(Tomcat)

前置条件:已安装tomcat、java

第一步、解压solr

​ unzip -q solr-7.5.0.zip

第二步、拷贝文件到指定位置

源(Solr文件夹) 目标(Tomcat文件夹)
server/solr-webapp/webapp 文件夹 webapps/solr(复制并改名为solr)
server/lib/ext*.jar 文件 webapps/solr/WEB-INF/lib*.jar 文件
server/lib/metrics*.jar 文件 webapps/solr/WEB-INF/lib*.jar 文件
dist/solr-dataimporthandler-*.jar 文件 webapps/solr/WEB-INF/lib*.jar 文件
server/resources/log4j2.xml webapps/solr/WEB-INF/classes(需创建classes文件夹)
server/solr 文件夹 拷贝并命名为solr_home(不再tomcat文件夹下)

第三步、配置solr

  • 编辑web.xml文件,设置【solr/home】项,指向创建的solr_home文件夹

    <env-entry>
      <env-entry-name>solr/home</env-entry-name>
      <env-entry-value>solr_home的目录位置</env-entry-value>
      <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
    
  • 配置安全性约束【security-constraint】项,注释相关内容。

  • 配置[【catalina.sh】,加入日志位置,例如

    JAVA_OPTS=${JAVA_OPTS}" -Dsolr.log.dir=/opt/solr_home/log "
    

三、schema和solrconfig设计

Field Type主要属性

属性名 作用
name fieldType的名称,该值在字段定义中的type属性中使用
class 用于存储和索引此类型数据的类名,如果是第三方类名需要使用全限定名
positionIncrementGap 对于多值字段,指定多个值之间的距离,以防止虚假短语匹配
autoGeneratePhraseQueries 对于文本字段。如果为true,Solr自动为相邻术语生成短语查询。如果false,术语必须用双引号括起来作为短语。
synonymQueryStyle 考虑使用查询时同义词搜索
docValuesFormat 定义DocValuesFormat用于此类字段的自定义。这需要一个支持模式的编解码器

字段定义,Field 默认属性

属性 描述 默认值
indexed 是否索引 true
stored 是否存储 true
docValues 字段的值是否将放在面向列的DocValues结构,用于排序操作 false
sortMissingFirst sortMissingLast 没有该field的数据排在有该field的数据之后/前,不论请求时的排序规则 false
multiValued 此字段是否存在多值行为 false
omitNorms 字段的长度不影响得分和在索引时不做boost *
omitTermFreqAndPositions 忽略匹配的Term的位置和出现频率分数 *
omitPositions 忽略匹配的Term的位置分数 *
termVectors termPositions termOffsets termPayloads FastVectorQuery使用,存储Trem向量信息用于快速查询,使用空间换得时间 false
required 此字段必须存在 false
useDocValuesAsStored 存储docValues true
large 大字段始终采用延迟加载,实际值<512KB只占用文档高速缓存空间。此选项需要stored="true"multiValued="false"支持 false

schema.xml文件含义及配置查看详情

solrconfig.xml文件含义及配置查看详情

枚举类型的定义

第一步:schema文件中定义枚举类型

<field name="sex" type="sex_enum" indexed="true" stored="true" multiValued="false" />

<fieldType name="sex_enum" class="solr.EnumFieldType" docValues="true" enumsConfig="enumsConfig.xml" enumName="sex"/>

第二步:配置【enumsConfig.xml】文件,如果存储索引值不在文件中存在将会报错

<?xml version="1.0" ?>
<enumsConfig>
  <enum name="sex">
    <value>M</value>
    <value>F</value>
  </enum>
</enumsConfig>

四、IK分词器的应用

以下以IK7.5为例,IK分词器的链接地址

  1. 将jar包放入solr服务的jetty或tomcat的webapp/WEB-INF/lib/目录下

  2. 将resources目录下的5个配置文件放入solr服务的jetty或tomcat的webapp/WEB-INF/classes/目录下

    IKAnalyzer.cfg.xml
    ext.dic
    stopword.dic
    ik.conf
    dynamicdic.txt
    
  3. 配置solr的schema.xml,添加ik分词器

    <!-- ik分词器 -->
    <fieldType name="text_ik" class="solr.TextField">
      <analyzer type="index">
        <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="false" conf="ik.conf"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="true" conf="ik.conf"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
    
  4. ik.conf文件说明

    #动态字典列表,可以设置多个字典表,用逗号进行分隔,默认动态字典表为dynamicdic.txt
    files=dynamicdic.txt
    #默认值为0,每次对动态字典表修改后请+1,不然不会将字典表中新的词语添加到内存中
    lastupdate=0
    

五、solrj功能的使用

schema.xml中定义的数据模型

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="_version_" type="plong" indexed="false" stored="false"/>
<field name="_root_" type="string" indexed="true" stored="false" docValues="false" />
<field name="_text_" type="text_general" indexed="true" stored="false" multiValued="true"/>
<field name="issn" type="string" indexed="true" stored="true" multiValued="false" />
<field name="name" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="price" type="pdouble" indexed="true" stored="true" multiValued="false"/>
<field name="pubshingTime" type="pdate" indexed="true" stored="true" multiValued="false"/>
<field name="authors" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="classifyCode" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="hasMarc" type="boolean" indexed="true" stored="true" multiValued="false"/>

1、创建/修改索引

方式一:使用字符串指定字段名

@Test
public void inputDoc () throws Exception {
  SolrClient sClient = getSolrClient();
  for (Book book : DataUtil.getBookList()) {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", book.getId());
    doc.addField("issn", book.getIssn());
    doc.addField("name", book.getName());
    doc.addField("price", book.getPrice());
    doc.addField("pubshingTime", book.getPubshingTime());
    doc.addField("authors", book.getAuthors());
    doc.addField("classifyCode", book.getClassify().getCode());
    doc.addField("hasMarc", book.getHasMarc());
    sClient.add(doc);
  }
  sClient.commit();
  sClient.close();
}

方式二:使用javaBean

  • 创建模型

    //使用@Field注解标志solr对应的字段名
    //ClassifyCode属性需要额外注意,此种情况下只能在setter方法上标注,切记!
    public class Book {
      @Field private String id;
      
      @Field private String name;
      
      @Field private String issn;
      
      @Field private List<String> authors;
      
      @Field private double price;
      
      private Classify classify;
      
      @Field private Date pubshingTime;
      
      @Field private Boolean hasMarc;
      
      public String getClassifyCode() {
        return this.classify == null ? null : this.classify.getCode();
      }
      
      @Field
      public void setClassifyCode(String classifyCode) {
        this.classify = (this.classify == null ? new Classify() : this.classify);
        this.classify.setCode(classifyCode);
      }
      //其他的getter/setter
    }
    
  • 创建索引

    @Test
    public void inputDocByBean() throws Exception {
      SolrClient sClient = getSolrClient();
      sClient.addBeans(DataUtil.getBookList());
      sClient.commit();
      sClient.close();
    }
    

2、删除索引

//通过查询删除
@Test
public void deleteDocByQuery() throws Exception {
  SolrClient sClient = getSolrClient();
  sClient.deleteByQuery("*:*");
  sClient.commit();
  sClient.close();
}

//通过ID删除
@Test
public void deleteDocById() throws Exception {
  SolrClient sClient = getSolrClient();
  sClient.deleteById("115373065864171873783");
  sClient.commit();
  sClient.close();
}

小技巧:在solr管控台界面可以在【Documents】选项下选择“Document Type>XML”使用以下xml格式进行删除

<!-- 方式一:通过查询删除 -->
<delete>
  <query>*:*</query>
</delete>
<commit/>

<!-- 方式二:指定ID删除 -->
<delete>
  <id>【id值】</id>
</delete>
<commit/>

3、主要查询功能

  • 按照id查询

    @Test
    public void testSearchById() throws Exception {
      SolrClient solrClient = getSolrClient();
      //使用id查询
      SolrDocument solrDocument = solrClient.getById("115373065864171873622");
      //转换为bean
      DocumentObjectBinder documentObjectBinder = new DocumentObjectBinder();
      Book book = documentObjectBinder.getBean(Book.class, solrDocument);
      System.out.println(book);
      solrClient.close();
    }
    
  • 按照查询语句查询

    @Test
    public void testSearchByQuery() throws Exception {
      SolrClient solrClient = getSolrClient();
      SolrQuery query = new SolrQuery("*:*");
      query.setStart(2);  //分页,开始索引。0为基数
      query.setRows(2);   //总共搜索条数
      QueryResponse response = solrClient.query(query);
      //response.getResults()返回SolrDocumentList对象,需要自己封装
      List<Book> books = response.getBeans(Book.class);  //直接拿到bean对象
      long numFound = response.getResults().getNumFound();
      long start = response.getResults().getStart();
      System.out.println("总记录数:"+numFound);
      System.out.println("开始记录:"+start);
      books.forEach(System.out::println);
    }
    
  • 数字范围查询

    @Test
    public void testNumRangeQuery() throws Exception {
      SolrClient solrClient = getSolrClient();
      SolrQuery query = new SolrQuery(" price:[35 TO 45] " + SimpleParams.AND_OPERATOR +" classifyCode:D ");
      QueryResponse response = solrClient.query(query);
      List<Book> books = response.getBeans(Book.class);
      long numFound = response.getResults().getNumFound();
      long start = response.getResults().getStart();
      System.out.println("总记录数:"+numFound);
      System.out.println("开始记录:"+start);
      books.forEach(System.out::println);
    }
    
  • 过滤查询【使用普通查询即AND连接会影响评分,过滤查询不会影响评分】

    @Test
    public void testFilterQuery() throws Exception {
      SolrClient solrClient = getSolrClient();
      SolrQuery query = new SolrQuery("*:*");
      //过滤的条件
      query.addFilterQuery("classifyCode:D");
      QueryResponse response = solrClient.query(query);
      List<Book> books = response.getBeans(Book.class);
      long numFound = response.getResults().getNumFound();
      long start = response.getResults().getStart();
      System.out.println("总记录数:"+numFound);
      System.out.println("开始记录:"+start);
      books.forEach(System.out::println);
    }
    
  • 高亮查询

    • 一般高亮查询
    @Test
    public void testHighlightQuery() throws Exception {
      SolrClient solrClient = getSolrClient();
      SolrQuery query = new SolrQuery("name:*政府*");
      //开启高亮
      query.setHighlight(true);
      //设置高亮前缀与后缀
      query.setHighlightSimplePre("<front color='read'>");
      query.setHighlightSimplePost("</front>");
      //设置高亮字段名
      query.addHighlightField("name");
      QueryResponse response = solrClient.query(query);
      List<Book> books = response.getBeans(Book.class);
      //获取高亮属性
      Map<String, Map<String, List<String>>> hMap = response.getHighlighting();
      //将高亮属性设置到相对应属性
      books.forEach((b) -> {
        Map<String, List<String>> mapField = hMap.get(b.getId());
        String hName =  mapField.get("name") == null ? b.getName() : mapField.get("name").get(0);
        b.setName(hName);
      });
      long numFound = response.getResults().getNumFound();
      long start = response.getResults().getStart();
      System.out.println("总记录数:"+numFound);
      System.out.println("开始记录:"+start);
      books.forEach(System.out::println);
    }
    
    • FastVectorHighlighter查询
    @Test
    public void testFastHighlightQuery() throws Exception {
      SolrClient solrClient = getSolrClient();
      SolrQuery query = new SolrQuery("name:政府");
      //开启高亮
      query.setHighlight(true);
      query.add("hl.method","fastVector");                        //使用fastVector
      query.add(HighlightParams.TAG_PRE,"<front color='read'>");  //和普通的前缀不一样
      query.add(HighlightParams.TAG_POST,"</front>");             //和普通的后缀不一样
      query.addHighlightField("name");
      QueryResponse response = solrClient.query(query);
      List<Book> books = response.getBeans(Book.class);
      //获取高亮属性
      Map<String, Map<String, List<String>>> hMap = response.getHighlighting();
      //将高亮属性设置到相对应属性
      books.forEach((b) -> {
        Map<String, List<String>> mapField = hMap.get(b.getId());
        String hName =  mapField.get("name") == null ? b.getName() : mapField.get("name").get(0);
        b.setName(hName);
      });
      long numFound = response.getResults().getNumFound();
      long start = response.getResults().getStart();
      System.out.println("总记录数:"+numFound);
      System.out.println("开始记录:"+start);
      books.forEach(System.out::println);
    }
    
  • 分组查询

    @Test
    public void testGroupQuery() throws Exception {
      SolrClient solrClient = getSolrClient();
      SolrQuery query = new SolrQuery("*:*");
      //分组参数
      query.add(GroupParams.GROUP,"on");
      query.add(GroupParams.GROUP_FIELD,"classifyCode");
      query.add(GroupParams.GROUP_TOTAL_COUNT,"on");  //开启总组数统计
      query.add(GroupParams.GROUP_LIMIT,"2");         //每个组返回的个数限制
      query.add(GroupParams.GROUP_OFFSET,"1");        //每个组的开始文档位置
      //QueryResponse无法获取总记录数和开始记录数
      QueryResponse response = solrClient.query(query);
      //获取分组响应信息
      GroupResponse groupResponse = response.getGroupResponse();
      List<GroupCommand> groupCommands = groupResponse.getValues();
      for (GroupCommand groupCommand : groupCommands) {
        int matchNum = groupCommand.getMatches();
        Integer nGroups = groupCommand.getNGroups();
        System.out.println("总组数:"+nGroups);
        System.out.println("总个数:"+matchNum);
        groupCommand.getValues().forEach(x -> {
          DocumentObjectBinder documentObjectBinder = new DocumentObjectBinder();
          List<Book> books = documentObjectBinder.getBeans(Book.class, x.getResult());
          System.out.println("分组名"+x.getGroupValue());
          System.out.println(books);
        });
      }
    }
    
  • 分面查询

    • 普通字段分片
    @Test
    public void testFacetsQuery() throws Exception {
      SolrClient solrClient = getSolrClient();
      SolrQuery query = new SolrQuery("*:*");
      query.addFacetField("classifyCode");
      QueryResponse response = solrClient.query(query);
      //获取分片信息
      for (FacetField facetField : response.getFacetFields()) {
        System.out.println("分类字段名" + facetField.getName());
        //获取统计信息
        for (FacetField.Count count: facetField.getValues()) {
          System.out.println(count.getName() + ":" + count.getCount());
        }
      }
      solrClient.close();
    }
    
    • 日期分片
    //UTC日期格式字符串转换为Date类型字符串
    static public String UTCStrToDateStr(String utc) {
      Instant ins = Instant.parse(utc);
      Date date = Date.from(ins);
      DateFormat formart = new SimpleDateFormat("yyyy-MM-dd");
      return formart.format(date);
    }
      
    @Test
    public void testDateFacetsQuery() throws Exception {
      SolrClient solrClient = getSolrClient();
      SolrQuery query = new SolrQuery("*:*");
      query.add(FacetParams.FACET_MINCOUNT,"1");   //最小统计数,小于此数不返回
      Date startDate = DateUtil.parse("2013-01-01");
      Date endtDate = DateUtil.parse("2015-01-01");;
      String gap = "+1MONTH";
      query.addDateRangeFacet("pubshingTime", startDate, endtDate, gap);
      QueryResponse response = solrClient.query(query);
      //获取分片信息
      for (RangeFacet<Date,Date> rangeFacet : response.getFacetRanges()) {
        System.out.println("分类字段名" + rangeFacet.getName());
        //获取统计信息
        for (Count count : rangeFacet.getCounts()) {
          System.out.println(DateUtil.UTCStrToDateStr(count.getValue())+" ~ " + 
                             gap + "  count:"+count.getCount());
        }
      }
      solrClient.close();
    }
    
    • 数字分片
    @Test
    public void testNumericFacetsQuery() throws Exception {
      SolrClient solrClient = getSolrClient();
      SolrQuery query = new SolrQuery("*:*");
      query.add(FacetParams.FACET_MINCOUNT,"1");  //最小统计数,小于此数不返回
      Number start = new Double("30.00");
      Number end = new Double("50.00");
      Number gap = new Double("5.00");;
      query.addNumericRangeFacet("price", start, end, gap);
      QueryResponse response = solrClient.query(query);
      //获取分片数据区域列表
      for (RangeFacet<Double,Double> rangeFacet : response.getFacetRanges()) {
        System.out.println("分类字段名" + rangeFacet.getName());
        //获取统计信息
        for (Count count : rangeFacet.getCounts()) {
          System.out.println(count.getValue()+" ~ " + gap + "  count:"+count.getCount());
        }
      }
    

    注:从多个维度进行查询face.pivot

  • 日期范围查询

    //将Date类型转换为UTC字符串形式,因为Solr只认可UTC格式的日期
    static public String dateToUTCStr(Date date) {
      Instant instant = date.toInstant();
      ZoneId zoneId = ZoneId.systemDefault();
      return instant.atZone(zoneId).format(DateTimeFormatter.ISO_INSTANT);
    }
      
    //时间查询
    @Test
    public void testDateRangQuery() throws Exception {
      SolrClient solrClient = getSolrClient();
      String dateStart = DateUtil.dateToUTCStr(DateUtil.parse("2013-09-01"));
      String dateEnd = DateUtil.dateToUTCStr(DateUtil.parse("2013-10-30"));
      SolrQuery query = new SolrQuery("pubshingTime:["+dateStart+" TO "+dateEnd+ "]");
      QueryResponse response = solrClient.query(query);
      List<Book> books = response.getBeans(Book.class);
      long numFound = response.getResults().getNumFound();
      long start = response.getResults().getStart();
      System.out.println("总记录数:"+numFound);
      System.out.println("开始记录:"+start);
      books.forEach(System.out::println);
      solrClient.close();
    }
    
  • suggest查询

    • 使用suggest需要单独配置solrconfig.xml
    <!-- 和requestHandler的components对应 -->
    <searchComponent name="suggest" class="solr.SuggestComponent">
      <lst name="suggester">
        <!-- 名称 -->
        <str name="name">issnSuggester</str>
        <!-- AnalyzingInfixLookupFactory支持上下文过滤
                 AnalyzingLookupFactory不支持上下文过滤    -->
        <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
        <str name="dictionaryImpl">DocumentDictionaryFactory</str>
        <!-- 建议应用的字段 -->
        <str name="field">issn</str>
        <!-- 过滤的上下文字段 -->
        <str name="contextField">classifyCode</str>
        <!-- 是否突出显示 -->
        <str name="highlight">false</str>
        <!-- 权重字段不是必选字段 -->
        <!--<str name="weightField">price</str>-->
        <str name="suggestAnalyzerFieldType">text_general</str>
        <!-- 何时建立索引,实时打开buildOnCommit,也可手动提交参数suggest.build=true -->
        <str name="buildOnStartup">false</str>
        <str name="buildOnCommit">true</str>
        <str name="buildOnOptimize">true</str>
      </lst>
    </searchComponent>
    <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
      <lst name="defaults">
        <str name="suggest">true</str>
        <!-- 查询建议的个数 -->
        <str name="suggest.count">5</str>
      </lst>
      <arr name="components">
        <str>suggest</str>
      </arr>
    </requestHandler>
    
    • 对应的java代码,注意setRequestHandler为【/suggest】
    @Test
    public void testSuggesterQuery() throws Exception {
      SolrClient solrClient = getSolrClient();
      SolrQuery query = new SolrQuery();
      //请求参数封装
      query.setRequestHandler("/suggest");              //handler
      query.add("suggest","true");                      //必须设置为true
      query.add("suggest.dictionary","issnSuggester");  //词典
      query.add("suggest.q","9");                       //查询参数
      query.add(CommonParams.WT,"json");                //返回格式
      query.add("suggest.cfq","D");                     //根据上下文过滤
      //query.add("suggest.build","true");              //是否立刻建立索引
      QueryResponse response = solrClient.query(query);
      //获得建议的结果
      SuggesterResponse suggesterResponse = response.getSuggesterResponse();
      Map<String, List<String>> suggestedTerms = suggesterResponse.getSuggestedTerms();
      List<String> termList = suggestedTerms.get("issnSuggester");
      System.out.println(termList);
      solrClient.close();
    }
    
  • moreLikeThis

    为了提高性能,判断相似的字段应该已经在 schema.xml 中存储了termVectors

    //【classifyCode ---> termVectors="true"】
    @Test
    public void testMoreLikeThisQuery() throws Exception {
      SolrClient solrClient = getSolrClient();
      SolrQuery query = new SolrQuery("*:*");     //查询,返回的每个结果都会进行相似匹配
      query.setFields("id");                      //返回的field
      query.addMoreLikeThisField("classifyCode"); //判读相似的字段
      query.setMoreLikeThisCount(2);              //返回相似文档信息的个数
      query.setMoreLikeThisMinDocFreq(1);         //最小文档频率,所在文档的个数小于这个值的词将不用于相似判断
      query.setMoreLikeThisMinTermFreq(1);        //最小分词频率,在单个文档中出现频率小于这个值的词将不用于相似判断
      QueryResponse response = solrClient.query(query);
      NamedList<SolrDocumentList> moreLikeThisList = response.getMoreLikeThis();
      for (Entry<String, SolrDocumentList> entry : moreLikeThisList) {
        System.out.println("id:" + entry.getKey());
        SolrDocumentList documentList = entry.getValue();
        System.out.println("相似总个数:" + documentList.getNumFound());
        for (SolrDocument solrDocument : documentList) {
          System.out.println(solrDocument.get("id"));
        }
      }
      solrClient.close();
    }
    

六、从数据库导入数据

第一步:配置solrconfig.xml,添加dateimprt组件

<!-- dateimprt component -->
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

第二步: 创建【data-config.xml】文件

<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
  <!-- 配置数据源 -->
  <dataSource name="bookSource" type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://192.168.1.107:3306/book?useUnicode=true&amp;characterEncoding=utf-8&amp;allowMultiQueries=true&amp;rewriteBatchedStatements=true"
              user="root"
              password="123456" />
  <document>
    <!-- 文档对象,可配置多个 -->
    <entity name="book" dataSource="bookSource" pk="id"
            query="SELECT id,name,issn,authors,price,pubshingTime,hasMarc,classifyCode FROM book ">
      <!-- 段名映射 -->
      <field column='id' name='id' />
      <field column='name' name='name' />
      <field column='issn' name='issn' />
      <field column='authors' name='authors' />
      <field column='price' name='price' />
      <field column='pubshingTime' name='pubshingTime' />
      <field column='hasMarc' name='hasMarc' />
      <field column='classifyCode' name='classifyCode' />
    </entity>
  </document>
</dataConfig>

第三步:在管控台界面执行

【Dataimport】下可以选择Clean清空以前记录再导入Entity实体类型

七、主从搭建

Master配置

<requestHandler name="/replication" class="solr.ReplicationHandler" >
  <lst name="master">
    <!-- 何时触发复制 -->
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">startup</str>
    <!-- 要同步到slave的文件以,隔开[一定不能配 solrconfig.xml] -->
    <str name="confFiles">managed-schema</str>               
  </lst>
</requestHandler>

Slave配置

<requestHandler name="/replication" class="solr.ReplicationHandler" >
  <lst name="slave">
    <!-- 主机位置,及其示例 -->
    <str name="masterUrl">http://127.0.0.1:8080/solr/book</str>
    <!-- 多少时间之后同步一次,设置为60秒 -->
    <str name="pollInterval">00:00:60</str>
    <!--  压缩机制,来传输索引, 可选internal|external, internal内网, external外网 -->
    <str name="compression">internal</str>
    <str name="httpConnTimeout">50000</str>
    <str name="httpReadTimeout">500000</str>
  </lst>
</requestHandler>

八、集群搭建

第一步:配置zookeeper集群,并启动【修改zoo.cfg,注意修改dataDir、端口、集群机器位置(使用ip)】

syncLimit=5
dataDir=/opt/zookeeper_dir/zookeeper01/data
clientPort=2181
server.1=192.168.1.155:2888:3888
server.2=192.168.1.155:2889:3889
server.3=192.168.1.155:2890:3890

第二步:修改每个solr-tomcat的web.xml【修改solr_home】

<env-entry>
  <env-entry-name>solr/home</env-entry-name>
  <env-entry-value>/opt/solr_home_dir/solr_cluster_01</env-entry-value>
  <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

第三步:修改tomcat的catalina.sh

JAVA_OPTS="${JAVA_OPTS} -Dsolr.log.dir=/opt/solr_home_dir/solr_cluster_01/log  "
JAVA_OPTS="${JAVA_OPTS} -DzkHost=192.168.1.155:2181,192.168.1.155:2182,192.168.1.155:2183 "

第四步:修改每个solr_home下的solr.xml【端口号与tomcat一致】

<solr>
  <solrcloud> 
    <!--......-->
    <str name="host">${host:192.168.1.155}</str>
    <int name="hostPort">${jetty.port:8080}</int>
    <!--......-->
  </solrcloud>
</solr>

第五步:上传solr配置文件到zookeeper【solr-7.5.0/server/scripts/cloud-scripts目录下】

./zkcli.sh -zkhost [ip:prot,ip:prot...] -cmd upconfig -confdir /opt/solr_home_dir/cluster_conf/conf -confname book

第六步:启动solr-tomcat,访问任意一个tomcat,在Collections页面中添加core

numShards:集群总分片数                    -- 主节点数量
replicationFactor:每个节点有多少个从节点    -- 每个主节点从节点数量
maxShardsPerNode:单个服务器最大节点数      --每台服务器最大节点数

注:solrj连接SolrCloud代码

static SolrClient getSolrClient(){
  final List<String> zkServers = new ArrayList<String>();
  zkServers.add("192.168.1.155:2181"); 
  zkServers.add("192.168.1.155:2182"); 
  zkServers.add("192.168.1.155:2183");
  //第二个参数为上传到zookeeper的solr配置目录
  CloudSolrClient cloudSolrClient = new CloudSolrClient.Builder(zkServers, Optional.of("/")).build();
  //默认的Collection
  cloudSolrClient.setDefaultCollection("book");
  return cloudSolrClient;
}