Strus - Web Service

Created by Andreas Baumann

http://www.andreasbaumann.cc/slide/struswebservice

(C)2015/2016

Outline

  • Goal: a language-independent web service for search.
  • Software that implements this vision already exists: Elasticsearch.
  • Elasticsearch is built on Lucene; we build the same kind of service on Strus.
  • Currently there is a Java API that uses this service.

Strus web service

Design rationales

  • standard technology (HTTP, JSON)
  • CRUD (all basic functions exist for all classes)
  • proxyable and securable
  • zero-configuration
  • scalable
  • C++
  • MPL v2
  • runs on Linux, OSX, FreeBSD

Protocol

  • Service functions
  • Index functions
  • Document functions
  • Query functions
  • Transactions
  • Statistics
  • Introspection and Configuration

Protocol

  • Manipulate indexes:
    
    curl -XPOST -H 'Content-Type: application/json' \
      http://localhost:8080/strus/index/create/A \
      -d '{ "params" : { } }'
    
    {"result":"ok"}
    
    curl -XPOST http://localhost:8080/strus/index/delete/A
    {"result":"ok"}
    
    curl -XPOST http://localhost:8080/strus/index/exists/A
    {"exists":false,"result":"ok"}
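
Because the protocol is plain HTTP and JSON, the same index calls can be made from any language. The following is a minimal Python sketch, assuming the service listens on http://localhost:8080 as in the curl examples. The helper name strus_request is hypothetical and not part of Strus. It only builds the requests, so they can be inspected without a running server.

```python
import json
import urllib.request

# Assumption: same host, port and path prefix as in the curl examples.
BASE_URL = "http://localhost:8080/strus"


def strus_request(path, body=None):
    """Build (but do not send) a POST request for a service path.

    Hypothetical helper; the path layout is copied from the curl examples.
    """
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        url=f"{BASE_URL}/{path}",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


create = strus_request("index/create/A", {"params": {}})
exists = strus_request("index/exists/A")
delete = strus_request("index/delete/A")

for req in (create, exists, delete):
    print(req.get_method(), req.full_url)

# To send a request against a running service:
# with urllib.request.urlopen(create) as resp:
#     print(json.load(resp))
```

Keeping request construction separate from sending makes the calls easy to log or to route through a proxy, which fits the "proxyable" design goal.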
    
                                

Protocol

  • Insert document:
    
    curl -XPOST -H 'Content-Type: application/json' \
      http://localhost:8080/strus/document/insert/A \
      -d '
    { "doc" : { "docid" : "doc3",
        "attributes" : [
            { "key" : "title", "value" : "This is a Hello World Document" },
            { "key" : "attr1", "value" : "val1" },
            { "key" : "attr2", "value" : "val2" } ],
        "metadata" : [
            { "key" : "doclen", "value" : 23773 },
            { "key" : "docweight", "value" : 3.1415 } ],
        "forward" : [
            { "type" : "word", "value" : "Hello", "pos" : 1 },
            { "type" : "word", "value" : "World", "pos" : 2 } ],
        "search" : [
            { "type" : "word", "value" : "hello", "pos" : 1 },
            { "type" : "word", "value" : "world", "pos" : 2 } ] }
    }'
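
In the example above, the "forward" entries carry the tokens as written ("Hello", "World"), while the "search" entries carry lowercased terms at the same positions. The sketch below builds such a payload in Python. The function name build_document is hypothetical, and the whitespace split plus lowercasing is only a naive stand-in for real text analysis, not what Strus itself does.

```python
import json


def build_document(docid, title, text, metadata=None):
    """Build an insert payload shaped like the curl example.

    Hypothetical helper; tokenisation is a naive whitespace split.
    """
    words = text.split()
    # Forward index: tokens as written, positions starting at 1.
    forward = [
        {"type": "word", "value": w, "pos": i}
        for i, w in enumerate(words, start=1)
    ]
    # Search index: lowercased terms at the same positions.
    search = [
        {"type": "word", "value": w.lower(), "pos": i}
        for i, w in enumerate(words, start=1)
    ]
    return {
        "doc": {
            "docid": docid,
            "attributes": [{"key": "title", "value": title}],
            "metadata": [
                {"key": k, "value": v} for k, v in (metadata or {}).items()
            ],
            "forward": forward,
            "search": search,
        }
    }


payload = build_document(
    "doc3",
    "This is a Hello World Document",
    "Hello World",
    {"doclen": 23773, "docweight": 3.1415},
)
print(json.dumps(payload, indent=2))
```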
                                

Protocol

  • Query:
    
    curl -XPOST -H 'Content-Type: application/json' \
      http://localhost:8080/strus/query/A \
      -d '
    { "query": {
        "first_rank": 0,
        "nof_ranks": 20,
        "weighting": {
            "scheme": {
                "name": "bm25",
                "params": [
                    {
                        "key": "b",
                        "value": 0.75
                    },
                    {
                        "key": "k1",
                        "value": 1.0001
                    },
                    {
                        "key": "avgdoclen",
                        "value": 11943
                    }
                ]
            }
        },
        "summarizer": [
            {
                "attribute": "attribute",
                "name": "attribute",
                "params": [
                    {
                        "key": "name",
                        "value": "docid"
                    }
                ]
            }
        ],
        "features": [
            {
                "name": "feat",
                "value": {
                    "term": {
                        "type": "word",
                        "value": "hello"
                    }
                },
                "weight": 1
            },
            {
                "name": "sel",
                "value": {
                    "expression": {
                        "operator": "union",
                        "range": 0,
                        "cardinality": 0,
                        "terms": [
                            {
                                "term": {
                                    "type": "word",
                                    "value": "hello"
                                }
                            }
                        ]
                    }
                },
                "weight": 1
            }
        ],
        "select": [
            "sel"
        ]
      }
    }'
    
    {
      "execution_time": 0.000983791,
      "ranklist": {
        "documents_ranked": 1,
        "documents_visited": 1,
        "passes_evaluated": 0,
        "ranks": [
          {
            "attributes": [
              {
                "key": "docid",
                "value": "doc3"
              }
            ],
            "docno": 1,
            "weight": 0
          }
        ]
      },
      "result": "ok"
    }
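
The query body is verbose enough that a client would normally generate it. The following Python sketch mirrors the structure of the request above and reads the document ids back out of a response. The function names build_query and ranked_docids are hypothetical. The BM25 parameters are copied from the example, not recommended defaults.

```python
def build_query(terms, first_rank=0, nof_ranks=20):
    """Build a BM25 query payload shaped like the curl example (sketch)."""
    # One weighted feature per search term.
    features = [
        {"name": "feat",
         "value": {"term": {"type": "word", "value": t}},
         "weight": 1}
        for t in terms
    ]
    # Selection feature: union of all terms, as in the example.
    features.append({
        "name": "sel",
        "value": {"expression": {
            "operator": "union", "range": 0, "cardinality": 0,
            "terms": [{"term": {"type": "word", "value": t}} for t in terms],
        }},
        "weight": 1,
    })
    return {"query": {
        "first_rank": first_rank,
        "nof_ranks": nof_ranks,
        "weighting": {"scheme": {"name": "bm25", "params": [
            {"key": "b", "value": 0.75},
            {"key": "k1", "value": 1.0001},
            {"key": "avgdoclen", "value": 11943},
        ]}},
        "summarizer": [{
            "attribute": "attribute",
            "name": "attribute",
            "params": [{"key": "name", "value": "docid"}],
        }],
        "features": features,
        "select": ["sel"],
    }}


def ranked_docids(response):
    """Return (docid, weight) pairs from a query response."""
    pairs = []
    for rank in response["ranklist"]["ranks"]:
        attrs = {a["key"]: a["value"] for a in rank["attributes"]}
        pairs.append((attrs.get("docid"), rank["weight"]))
    return pairs


# Response copied from the example above.
sample = {
    "execution_time": 0.000983791,
    "ranklist": {
        "documents_ranked": 1, "documents_visited": 1, "passes_evaluated": 0,
        "ranks": [{"attributes": [{"key": "docid", "value": "doc3"}],
                   "docno": 1, "weight": 0}],
    },
    "result": "ok",
}
print(ranked_docids(sample))
```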
                                

Protocol

  • Transactions:
    
    curl -XPOST -H 'Content-Type: application/json' \
      http://localhost:8080/strus/transaction/begin/A/T1

    curl -XPOST -H 'Content-Type: application/json' \
      http://localhost:8080/strus/document/insert/A \
      -d '
    { "transaction" : { "id" : "T1" },
      "doc" : { "docid" : "doc1",
      ...
    }'

    curl -XPOST -H 'Content-Type: application/json' \
      http://localhost:8080/strus/transaction/commit/A/T1
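
The transaction flow above is begin, one or more inserts that name the transaction id, then commit. A small Python sketch of that ordering follows. The function name transaction_calls is hypothetical, and only the begin and commit endpoints shown above are used. It returns the calls as data instead of sending them.

```python
def transaction_calls(index, transaction_id, docs):
    """Return the ordered (path, body) pairs for one transaction.

    Hypothetical helper; paths are copied from the curl examples.
    """
    calls = [(f"transaction/begin/{index}/{transaction_id}", None)]
    for doc in docs:
        # Each insert names the transaction it belongs to.
        body = {"transaction": {"id": transaction_id}, "doc": doc}
        calls.append((f"document/insert/{index}", body))
    calls.append((f"transaction/commit/{index}/{transaction_id}", None))
    return calls


calls = transaction_calls("A", "T1", [{"docid": "doc1"}, {"docid": "doc2"}])
for path, body in calls:
    print(path, body)
```

Each pair can then be POSTed in order with any HTTP client.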
                                

Protocol

  • Introspection (document):
    
    curl -XPOST -H 'Content-Type: application/json' \
      http://localhost:8080/strus/document/get/A/doc3
    
    {
      "doc": {
        "attributes": [
          {
            "key": "attr1",
            "value": "val1"
          },
          {
            "key": "attr2",
            "value": "val2"
          },
          {
            "key": "docid",
            "value": "doc3"
          },
          {
            "key": "title",
            "value": "This is a Hello World Document"
          }
        ],
        "docno": 1,
        "forward": [
          {
            "pos": 1,
            "type": "word",
            "value": "Hello"
          },
          {
            "pos": 2,
            "type": "word",
            "value": "World"
          }
        ],
        "metadata": [
          {
            "key": "docweight",
            "value": 3.141499996185303
          },
          {
            "key": "doclen",
            "value": 23773
          },
          {
            "key": "date",
            "value": 0
          }
        ],
        "search": [
          {
            "pos": 1,
            "type": "word",
            "value": "hello"
          },
          {
            "pos": 2,
            "type": "word",
            "value": "world"
          }
        ]
      },
      "execution_time": 0.00028157,
      "result": "ok"
    }   
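
Note that docweight was inserted as 3.1415 but is returned as 3.141499996185303. This is consistent with the value being stored as a 32-bit float, and the index configuration of this example reports docweight with type Float32. The Python sketch below reproduces the rounding. The helper names are hypothetical, and the UInt16 range check is an assumption about how such a metadata type would constrain values.

```python
import struct


def as_float32(value):
    """Round-trip a Python float through IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def fits_uint16(value):
    """Check whether an integer fits in an unsigned 16-bit field."""
    return isinstance(value, int) and 0 <= value <= 0xFFFF


stored = as_float32(3.1415)
print(stored)               # close to 3.1415, but not equal to it
print(fits_uint16(23773))   # the doclen from the example
```

A client that compares inserted and retrieved metadata should therefore compare floats with a tolerance, not for equality.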
                                

Protocol

  • Statistics and introspection (index):
    
    curl http://localhost:8080/strus/index/stats/A
    
    {"result":"ok","stats":{"nof_docs":1}}
    
    curl http://localhost:8080/strus/index/config/A
    
    {"config":{
        "attributes":["attr1","attr2","docid","title"],
        "metadata":[{"name":"date","type":"UInt16"},{"name":"doclen","type":"UInt16"},{"name":"docweight","type":"Float32"}],
        "types":["word"]},
    "result":"ok"}
                                

Protocol

  • Statistics and introspection (system):
    
    curl http://localhost:8080/strus/config
    {"config":{
        "posting_join_operators":[
            "chain","chain_struct","contains","diff","inrange","inrange_struct","intersect","pred","sequence","sequence_struct","succ","union","within","within_struct"],
        "summarizer_functions":[
            "accuvariable","attribute","matchphrase","matchpos","matchvariables","metadata"],
        "weighting_functions":[
            "bm25","formula","metadata","td","tf"]},
    "result":"ok"}
    
       ... 
       "weighting_functions": [
          {
            "description": "Calculate the document weight with the weighting scheme \"BM25\"",
            "name": "bm25",
            "parameter": [
              {
                "description": "defines the query features to weight",
                "name": "match",
                "type": "feature"
              },
              {
                "description": "parameter of the BM25 weighting scheme",
                "name": "k1",
                "type": "numeric"
              },
        ...
                                

Java API

  • Requires Java 1.8 (the latest release at the time of writing)
  • Uses Jackson for POJO serialization from/to JSON
  • Uses javax.ws.rs.client web service classes for HTTP

Todos and future

  • support all functions of the Strus API
  • integrate strusAnalyzer and strusStream APIs
  • start with simple cluster mode (distribution proxy service)