Thursday, June 22, 2017

Elasticsearch: Extending ES using AOP

Elasticsearch (ES) can be extended with scripts or plug-ins, but it restricts how far existing functionality, such as the search actions, can be changed or extended. Say, for example, I need to alter the result set of every query or rewrite the incoming search criteria. In such cases we can use aspect-oriented programming (AOP) to extend ES.
AOP is a vast topic in itself, so I will not cover it here. Basically, it lets us control what happens before and after a method executes, and change the original input parameters or the return value.
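To make the idea concrete without any AspectJ setup, here is a minimal sketch that imitates around-style advice with a plain JDK dynamic proxy. This is a swapped-in technique for illustration only, not how AspectJ weaves classes, and the Greeter interface and AroundAdviceDemo class are hypothetical names that have nothing to do with ES.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical target interface; stands in for any type whose method we want to intercept.
interface Greeter {
    String greet(String name);
}

final class AroundAdviceDemo {
    // Records what the "advice" observed, so the effect is easy to inspect.
    static final List<String> LOG = new ArrayList<>();

    // Wraps a Greeter so that every greet() call passes through around-style logic.
    static Greeter withAdvice(Greeter target) {
        InvocationHandler advice = (proxy, method, args) -> {
            if (!method.getName().equals("greet")) {
                return method.invoke(target, args);        // leave toString() etc. untouched
            }
            LOG.add("before greet args=" + args[0]);        // runs before the method
            args[0] = ((String) args[0]).trim();            // change the input parameter
            Object result = method.invoke(target, args);    // proceed to the real method
            LOG.add("after greet result=" + result);        // runs after the method
            return result + "!";                            // change the return value
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(), new Class<?>[] {Greeter.class}, advice);
    }

    public static void main(String[] args) {
        Greeter plain = name -> "Hello " + name;
        System.out.println(withAdvice(plain).greet("  ES  "));
        LOG.forEach(System.out::println);
    }
}
```

The AspectJ weaver achieves a similar effect by modifying the target class itself, which is why it also works for classes such as SearchService that are not called through an interface we control.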
For this post we will monitor the ES search parameters using AOP. We will use AspectJ in a Maven project, as follows:
  • Create the maven project
  • Find out the ES methods to be monitored
  • Define the Pointcuts & Advices
  • Compile the jar
  • Use Load Time Weaving
  • Start ES and monitor the queries

Source code
The source code for this example is located here.

Create the maven project
Create a simple Maven jar project with the following POM XML.
  • Add dependencies for aspectjrt line 13-17 and aspectjweaver line 18-22
  • Add aspectj-maven-plugin line 31-48

ES methods to monitor
We will monitor the searches executed by ES. The actual search execution is done by the SearchService class. We will monitor its executeQueryPhase() and executeFetchPhase() methods.

Define Pointcuts & Advices
The SearchServiceAspect class defines the Pointcuts & Advices using annotations. Following is the definition to monitor the SearchService.executeQueryPhase() method:
  • Line 5 defines the Around advice.
  • The advice is for ES method SearchService.executeQueryPhase() which takes two parameters: ShardSearchTransportRequest & SearchTask
  • It also declares that these two parameters should be passed to our method, which will handle the execution.
  • Line 6 is the method in our class that runs whenever SearchService.executeQueryPhase() is about to be executed.
  • In this method we can decide what to do, i.e. do some processing of our own or continue the original execution using joinPoint.proceed().
  • For this example, we are just logging the search query information.

aop.xml
Next we define aop.xml, which configures the AspectJ load-time weaver.
  • Line 4 declares that our aspect pointcuts & advices are in class my.elasticsearch.aspects.SearchServiceAspect
  • Line 6 tells the weaver to be more verbose, which helps when troubleshooting any issues.
  • Line 7 tells the weaver which class in the target application is to be woven.
  • The aop.xml needs to be present at src/main/resources/META-INF in the project.

Deploying the jars
  • Compile the project and generate the jar using command: mvn install
  • Copy the generated es5x-method-interceptor-0.0.1-SNAPSHOT.jar into the elasticsearch/lib folder as per the ES installation.

Configure ES startup
Now we need to start ES with the AspectJ weaver as a Java agent.
  • Copy the aspectjweaver-1.8.9.jar into the elasticsearch/bin folder.
  • Open the bin/elasticsearch.bat or bin/elasticsearch.sh file (I have verified this with the Windows batch file only).
  • Add the following line before the java command that launches ES:
             SET ES_JAVA_OPTS=%ES_JAVA_OPTS% -Djava.security.policy=enable_aspectj_classes.policy -javaagent:aspectjweaver-1.8.9.jar
  • With the above change we are passing two additional command-line parameters when starting ES:
    • javaagent: Uses the AspectJ weaver as a Java agent.
    • java.security.policy: ES 5.x runs under the Java security manager, so the weaver will not work with the default permissions. With this parameter we ask the security manager to grant the required permissions. The content of enable_aspectj_classes.policy is as follows:
          grant {
              permission java.security.AllPermission;
          };
      • We are granting all permissions to the application. It is recommended that you work out the granular permissions actually needed, by enabling security logging at the access level, and update the above file accordingly.
      • The file is placed in the elasticsearch/bin folder.
The logs from the weaver won't be included in the ES logs, so you can redirect the ES command-line output (if not running as a service/daemon) to a file as follows:
      elasticsearch.bat > c:\es5x.log 2>&1
You should see the following logs:
Lines 18 & 19 indicate that our pointcuts & advices have been woven.

Populating test data
  • We will create two indices, aop_test_01 & aop_test_02, each with a type named file holding documents like the following:
        {"name": "file_01", "size": 5678}
  • Populate as many entries as you want.
  • Make sure both indices have some data.
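One way to populate the indices in a single request is the ES `_bulk` API, which takes newline-delimited JSON: an action line followed by a source line for each document. The sketch below only builds such a body. The BulkBodyBuilder class and the generated names and sizes are assumptions for illustration; only the first document matches the sample above.

```java
import java.util.Locale;

final class BulkBodyBuilder {
    // Builds an NDJSON body for the _bulk API: one action line plus one source line per document.
    static String build(String index, String type, int count) {
        StringBuilder body = new StringBuilder();
        for (int i = 1; i <= count; i++) {
            // Action line: index the next document into the given index and type.
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_type\":\"").append(type).append("\"}}\n");
            // Source line: made-up documents file_01, file_02, ... with sizes starting at 5678.
            body.append(String.format(Locale.ROOT,
                    "{\"name\":\"file_%02d\",\"size\":%d}\n", i, 5678 + i - 1));
        }
        return body.toString();   // the bulk body must end with a newline, which it does
    }

    public static void main(String[] args) {
        // Print a body covering both test indices; save it to a file and send it to ES.
        System.out.print(build("aop_test_01", "file", 3));
        System.out.print(build("aop_test_02", "file", 3));
    }
}
```

The printed output can be saved to a file and posted to http://localhost:9200/_bulk with any HTTP client, for example curl with --data-binary so that the newlines are preserved.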

Querying the data
Let's issue a simple search spanning both indices:
      http://localhost:9200/aop_test_*/_search
The following logs show that our advice is getting executed:
  • Since we are querying multiple indices, the search fans out to multiple shards, and we see both the executeQueryPhase() & executeFetchPhase() advices getting called.
  • In executeQueryPhase() we are just logging the indices. We could log additional info, such as the query to be executed.
  • In executeFetchPhase() we are also logging the docIds which the search returns.
  • You can read more on how search gets executed here.
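To see why two separate methods are involved, here is a simplified model of a two-phase (query, then fetch) search across shards. It is not ES code: the QueryThenFetchSketch class, the shard names, the scores and the documents are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class QueryThenFetchSketch {
    // What the query phase returns per hit: shard, doc id and score, but no document body.
    static final class Hit {
        final String shard;
        final int docId;
        final double score;
        Hit(String shard, int docId, double score) {
            this.shard = shard;
            this.docId = docId;
            this.score = score;
        }
    }

    static final Comparator<Hit> BY_SCORE_DESC =
            Comparator.comparingDouble((Hit h) -> h.score).reversed();

    // Query phase: a shard ranks its own documents and returns only its local top `size` hits.
    static List<Hit> queryPhase(String shard, Map<Integer, Double> scores, int size) {
        return scores.entrySet().stream()
                .map(e -> new Hit(shard, e.getKey(), e.getValue()))
                .sorted(BY_SCORE_DESC)
                .limit(size)
                .collect(Collectors.toList());
    }

    // Coordinating step: merge the per-shard hits and keep the global top `size`.
    static List<Hit> merge(List<List<Hit>> perShard, int size) {
        return perShard.stream()
                .flatMap(List::stream)
                .sorted(BY_SCORE_DESC)
                .limit(size)
                .collect(Collectors.toList());
    }

    // Fetch phase: load the document body only for hits that survived the merge.
    static String fetchPhase(Hit hit, Map<String, Map<Integer, String>> docsByShard) {
        return docsByShard.get(hit.shard).get(hit.docId);
    }

    // Runs both phases over two hypothetical single-shard indices.
    static List<String> search(int size) {
        Map<String, Map<Integer, Double>> scoresByShard = Map.of(
                "aop_test_01[0]", Map.of(1, 0.9, 2, 0.4),
                "aop_test_02[0]", Map.of(1, 1.2, 2, 0.7));
        Map<String, Map<Integer, String>> docsByShard = Map.of(
                "aop_test_01[0]", Map.of(1, "file_01", 2, "file_02"),
                "aop_test_02[0]", Map.of(1, "file_03", 2, "file_04"));

        List<List<Hit>> perShard = new ArrayList<>();
        scoresByShard.forEach((shard, scores) -> perShard.add(queryPhase(shard, scores, size)));

        List<String> fetched = new ArrayList<>();
        for (Hit hit : merge(perShard, size)) {
            fetched.add(fetchPhase(hit, docsByShard));
        }
        return fetched;
    }

    public static void main(String[] args) {
        System.out.println(search(2));
    }
}
```

In this model the query phase runs once per shard, while the fetch phase runs only for the documents that make the final result, which is consistent with the docIds appearing in the fetch-phase log.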

Summary
With this simple POC we have hooked our custom code into ES, which will allow us to customize ES behavior as needed. So far we have only logged the input requests, but the same approach can do more, such as changing the input or the output.

