Thursday, December 20, 2018

How to Set Up and Configure a Simple Alfresco Cluster

This article explains how to set up and configure a simple Alfresco cluster. This cluster will have these elements:


  • Alfresco installation on two nodes (each one will have its own Solr index)
  • Database install on one node
  • Contentstore being shared from one node
  • Proxy load balancer (using Apache)


Architecture

This can be spread across many server nodes if you want, but we'll show you how to do it on 3 servers:


  • Alfresco install on alf1 node
  • Alfresco install on alf2 node
  • Database installation and shared NFS drive on db1 node


It is possible to do this on 2 servers (or even 1) by installing the database on one of the Alfresco servers, but for learning purposes it's best to use 3 if you can. If you have enough resources locally, you can do this with VirtualBox or VMware. For this article, I'll do it using AWS.

Logically speaking, an Alfresco cluster will have at least two Alfresco installations. Both installations will use a single database with the same connection settings. For my example, I am using MySQL. Both Alfresco installs will also share a single contentstore where the files are kept. Each Solr instance, however, must have its own index: Solr's index files pertain only to the local Solr install and cannot be shared among Solr installations.

I have built 3 virtual servers running Ubuntu 16.04:


  • alf1 - 2 CPUs with 8GB of memory (hard drive space doesn't need to be more than the default 8GB that you get with an AWS instance)
  • alf2 - 2 CPUs with 8GB of memory (same hard drive specs)
  • db1 - 1 CPU with 4GB of memory (same hard drive specs)


Prepare Servers

Once you've built these three servers, log in to each one and run updates and upgrades to bring each system up to date:

  # sudo apt update && sudo apt upgrade

Make sure these ports are open in both the AWS security group and each server's firewall:


  • alf1 and alf2: 8080, 8443 and 80
  • db1: 3306, 2049 and 111
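
For the host-level firewall, one option is ufw, the firewall front end that ships with Ubuntu. The sketch below assumes ufw is the tool in use (the steps above don't depend on it) and that TCP rules are enough. The function names ports_for and open_ports are made up for illustration. By default it only prints the commands; set DRY_RUN=0 to actually run them. The AWS security group still has to be edited separately.

```shell
#!/bin/sh
# Ports each node needs, as listed above.
ports_for() {
    case "$1" in
        alf1|alf2) echo "8080 8443 80" ;;
        db1)       echo "3306 2049 111" ;;
        *)         echo "unknown node: $1" >&2; return 1 ;;
    esac
}

# Print (default) or run one ufw rule per port for the given node.
open_ports() {
    node_ports=$(ports_for "$1") || return 1
    for port in $node_ports; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "sudo ufw allow $port/tcp"
        else
            sudo ufw allow "$port/tcp"
        fi
    done
}

# Dry run for the database/NFS node.
open_ports db1
```

Run it with the matching node name on each server, for example open_ports alf1 on alf1.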


On alf1 and alf2 run this command to install the prerequisites for the Alfresco installation:

  # sudo apt install libfontconfig1 libfontconfig1-dev libice-dev libice6 libsm-dev libsm6 libxrender-dev libxrender1 libxext-dev libxext6 libxinerama-dev libxinerama1 libcups2 libcups2-dev libglu1-mesa libglu1-mesa-dev libcairo2 libcairo2-dev

Install MySQL

On db1 install MySQL server:

# sudo apt install mysql-server

The MySQL server install should give you the opportunity to set the root password. For this article, I have set the password to 'Alfr3sc0'.

Open the MySQL client:

# mysql -u root -pAlfr3sc0

In the MySQL client, issue these commands to create the Alfresco database and set the proper permissions:

mysql> create database alfresco;

mysql> grant all privileges on alfresco.* to 'alfresco'@'alfresco' identified by 'Alfr3sc0';
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> grant all privileges on alfresco.* to 'alfresco'@'172.30.1.197' identified by 'Alfr3sc0';
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> grant all privileges on alfresco.* to 'alfresco'@'172.30.1.78' identified by 'Alfr3sc0';
Query OK, 0 rows affected, 1 warning (0.00 sec)
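
One thing that may trip you up here: on a stock Ubuntu 16.04 install, MySQL 5.7 usually listens only on 127.0.0.1, so alf1 and alf2 might not be able to connect even with the grants in place. The sketch below is an assumption about a default setup, not something taken from the steps above. The config path and the helper name set_bind_address are illustrative, so check where bind-address is defined on your system first.

```shell
#!/bin/sh
# Assumed default location on Ubuntu 16.04 with MySQL 5.7; verify on your system.
MYSQL_CNF="${MYSQL_CNF:-/etc/mysql/mysql.conf.d/mysqld.cnf}"

# Rewrite the bind-address line of a config file to the given address.
# A backup copy is kept next to the file with a .bak suffix.
set_bind_address() {
    cnf="$1"; addr="$2"
    sed -i.bak "s/^bind-address.*/bind-address = $addr/" "$cnf"
}

# On db1 (as root), using db1's internal address:
#   set_bind_address "$MYSQL_CNF" 172.30.1.198 && service mysql restart
# From alf1 or alf2 (needs the mysql client installed there):
#   mysql -h 172.30.1.198 -u alfresco -p alfresco -e 'select 1'
```

If the select from an Alfresco node returns a row, the grants and the network path are both working.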

Install NFS

On db1 install the NFS server and set up the contentstore share:

   # sudo apt install nfs-kernel-server

Create the Alfresco contentstore shared folder:

# mkdir -p /alfresco/contentstore

Add an export entry for each Alfresco server:

    # vi /etc/exports

    /alfresco/contentstore    172.30.1.78(rw,sync,no_root_squash,subtree_check)
    /alfresco/contentstore    172.30.1.197(rw,sync,no_root_squash,subtree_check)

Restart the NFS server so that these settings take effect:

    # service nfs-server restart

Run the exportfs command to re-export everything listed in /etc/exports and make sure the shared folders are in effect:

    # sudo exportfs -ra


On both alf1 and alf2 install the NFS clients:

    # apt install nfs-common

Create the Alfresco Contentstore mount point folder:

    # mkdir /mnt/alfresco-contentstore
   
In the /etc/fstab file, create an entry that points to the NFS share on db1:

    # vi /etc/fstab

  172.30.1.198:/alfresco/contentstore  /mnt/alfresco-contentstore nfs rw,soft,intr,noatime,x-gvfs-show

Mount the mount points:

    # mount -a

Check the output of the df command to ensure the mount points are active:

    # df -kh

Test the mount points and make sure that your user can create a file there:

    # cd /mnt/alfresco-contentstore/
    # touch test
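
The same check can be wrapped in a small helper so it's easy to repeat on both nodes. This is only a sketch: the function name check_writable is made up, and mountpoint (from util-linux) is assumed to be available, as it normally is on Ubuntu.

```shell
#!/bin/sh
# Succeed only if the directory exists and a file can be created and removed in it.
check_writable() {
    dir="$1"
    probe="$dir/.write-test.$$"
    [ -d "$dir" ] || { echo "missing: $dir" >&2; return 1; }
    touch "$probe" 2>/dev/null || { echo "not writable: $dir" >&2; return 1; }
    rm -f "$probe"
    echo "ok: $dir"
}

# On alf1 and alf2, as the user that runs Alfresco:
#   mountpoint -q /mnt/alfresco-contentstore && check_writable /mnt/alfresco-contentstore
```

Running it as the Alfresco user rather than root matters, because that is the account that will write to the contentstore.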
 

Install Alfresco

On alf1 and alf2 install Alfresco.

For this article, I have used Alfresco 5.2.4 but this should work with any recent version of either Alfresco Enterprise or Community:

# ./alfresco-installer.bin --mode text

Set the following configurations (make sure the ip addresses shown reflect the database server ip address in your environment):

JDBC URL: [jdbc:postgresql://localhost/alfresco]: jdbc:mysql://172.30.1.198/alfresco?useUnicode=yes&characterEncoding=UTF-8

JDBC Driver: [org.postgresql.Driver]: org.gjt.mm.mysql.Driver

Database name: [alfresco]: alfresco

Download the MySQL JDBC jar and place it in the tomcat/lib directory of each Alfresco install:

# wget http://central.maven.org/maven2/mysql/mysql-connector-java/5.1.42/mysql-connector-java-5.1.42.jar
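
The wget above only downloads the jar, so it still has to end up in tomcat/lib. A minimal sketch of that copy step follows. ALF_HOME is a placeholder for whatever install directory you picked in the installer, and install_driver is a made-up helper name.

```shell
#!/bin/sh
# ALF_HOME is a placeholder; set it to the directory chosen during the Alfresco install.
ALF_HOME="${ALF_HOME:-/opt/alfresco}"
JAR="mysql-connector-java-5.1.42.jar"

# Copy the driver jar into a Tomcat lib directory, failing loudly if either path is wrong.
install_driver() {
    src="$1"; lib_dir="$2"
    [ -f "$src" ] || { echo "driver jar not found: $src" >&2; return 1; }
    [ -d "$lib_dir" ] || { echo "no such directory: $lib_dir" >&2; return 1; }
    cp "$src" "$lib_dir/"
}

# On alf1 and alf2, from the directory where the jar was downloaded:
#   install_driver "$JAR" "$ALF_HOME/tomcat/lib"
```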

In alfresco-global.properties (in the tomcat/shared/classes folder) set the dir.contentstore to use the mounted Alfresco Contentstore drive:

dir.contentstore=/mnt/alfresco-contentstore
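
Since both nodes must agree on the database and the contentstore, it can help to read the values back from each node's alfresco-global.properties and compare them. The helper below is a sketch. The name prop is made up, and ALF_HOME in the usage lines is a placeholder for your install directory. The keys db.url and db.driver are the ones the installer normally writes.

```shell
#!/bin/sh
# Print the value of a key from a Java properties file (last definition wins).
prop() {
    key="$1"; file="$2"
    # Escape dots so "db.url" does not match unrelated keys.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    grep -E "^[[:space:]]*$pattern[[:space:]]*=" "$file" | tail -n 1 | cut -d= -f2-
}

# On alf1 and alf2, the output of these should be identical:
#   prop db.url           "$ALF_HOME/tomcat/shared/classes/alfresco-global.properties"
#   prop db.driver        "$ALF_HOME/tomcat/shared/classes/alfresco-global.properties"
#   prop dir.contentstore "$ALF_HOME/tomcat/shared/classes/alfresco-global.properties"
```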

Start up alf1 and make sure it works as expected. To test, you should create a Share site, add a document and be able to successfully search for it. Follow these same steps for alf2 and test to see if you can find the same file there too -- in the expected folder and in search as well. With alf1 you can go ahead and add your license. The license info will be added to the database. Since both Alfresco installs are referencing the same database, the license info will be sufficient for both alf1 and alf2.

At a minimum, you should be able to log in to each Alfresco server. But you will notice that if you log in on both alf1 and alf2 and then log out of alf1, you will still be logged in to alf2. This is because each node keeps its own session for the same user, which is not what we want. Go ahead and stop the Alfresco services on both alf1 and alf2. In the next section, we'll configure Hazelcast to enable session replication.


Share Cluster (session replication)

To configure Hazelcast for session replication, you only need to remove the .sample extension from the custom-slingshot-application-context.xml.sample file in the tomcat/shared/classes/alfresco/web-extension folder so that it becomes custom-slingshot-application-context.xml.

Inside this file, change 192.168.0.* to the internal ip address of that Alfresco server. After you do this on both Alfresco nodes, restart them one at a time. Once you've done that, log in to both Alfresco nodes.
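
If you'd rather script the rename and the address change, something like the following should do it. Treat it as a sketch: enable_share_cluster is a made-up name, ALF_HOME is a placeholder for your install directory, and it writes a new file from the sample rather than renaming it, so the original sample stays around.

```shell
#!/bin/sh
# Create the active config from the sample, replacing the placeholder
# interface pattern 192.168.0.* with this node's internal IP address.
enable_share_cluster() {
    ext_dir="$1"; node_ip="$2"
    sample="$ext_dir/custom-slingshot-application-context.xml.sample"
    target="$ext_dir/custom-slingshot-application-context.xml"
    [ -f "$sample" ] || { echo "sample not found: $sample" >&2; return 1; }
    sed "s/192\.168\.0\.\*/$node_ip/g" "$sample" > "$target"
}

# alf1: enable_share_cluster "$ALF_HOME/tomcat/shared/classes/alfresco/web-extension" 172.30.1.197
# alf2: enable_share_cluster "$ALF_HOME/tomcat/shared/classes/alfresco/web-extension" 172.30.1.78
```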

Then, log out of alf1 and do a refresh of alf2. On alf2 you should be redirected to the login page. This demonstrates that you are using the same session on both nodes.

For this to be a true application cluster, we need a single hostname for accessing Alfresco. Requests to this hostname will then be routed to either Alfresco node based on the load balancer's configuration. For the hostname we'll use alfrescodemo.com. On your workstation, add the following to your hosts file (for demonstration purposes):

192.168.56.101  alfrescodemo.com

The ip address above should point to your database server, which is where we'll install Apache to handle the load balancing.


Apache Proxy/Load Balancing

On the db1 node install Apache 2.4 and mod_jk:

# sudo apt install apache2 libapache2-mod-jk
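
The virtual host in this section uses ProxyPass with a balancer:// URL and ajp:// members. Those directives come from mod_proxy, mod_proxy_balancer and mod_proxy_ajp rather than mod_jk, and on Ubuntu they are typically not enabled out of the box. The sketch below assumes a stock Apache 2.4 package and the default byrequests balancing method. It only prints the commands unless DRY_RUN=0 is set, and the function name is made up.

```shell
#!/bin/sh
# Modules the ProxyPass/balancer directives rely on (assumed not yet enabled).
PROXY_MODULES="proxy proxy_ajp proxy_balancer lbmethod_byrequests"

# Print (default) or run a2enmod for each module.
enable_proxy_modules() {
    for mod in $PROXY_MODULES; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "sudo a2enmod $mod"
        else
            sudo a2enmod "$mod"
        fi
    done
}

enable_proxy_modules
# After editing the site config, check the syntax before restarting:
#   sudo apache2ctl configtest
```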

Open the default host configuration 000-default.conf in /etc/apache2/sites-available and add the following:

<VirtualHost *:80>

        ServerName alfrescodemo.com
        ProxyRequests Off
        ProxyPassReverse /share balancer://app
        ProxyPass /share balancer://app stickysession=JSESSIONID|jsessionid nofailover=On
        <Proxy balancer://app>
                BalancerMember ajp://172.30.1.197:8009/share route=tomcat1
                BalancerMember ajp://172.30.1.78:8009/share route=tomcat2
        </Proxy>

</VirtualHost>

Save the file. In the /etc/libapache2-mod-jk folder, back up the workers.properties file and create a new one. In the new workers.properties file add the following (change the ip addresses to the ones you are using for alf1 and alf2 in your environment):

worker.list=loadbalancer

worker.tomcat1.port=8009
worker.tomcat1.host=172.30.1.197
worker.tomcat1.type=ajp13
worker.tomcat1.lbfactor=1

worker.tomcat2.port=8009
worker.tomcat2.host=172.30.1.78
worker.tomcat2.type=ajp13
worker.tomcat2.lbfactor=1

worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=tomcat1,tomcat2
worker.loadbalancer.sticky_session=1

worker.tomcat1.socket_keepalive=1
worker.tomcat2.socket_keepalive=1
worker.loadbalancer.method=B

Save this file and restart Apache by issuing:

# service apache2 restart

On the alf1 and alf2 servers, make the following changes to the server.xml (in tomcat/conf):

<Connector port="80" URIEncoding="UTF-8" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443" maxHttpHeaderSize="32768" />

and change the following in the existing Engine stanza:

<Engine name="Catalina" defaultHost="localhost" jvmRoute="tomcat1">

Use jvmRoute="tomcat1" for alf1 and jvmRoute="tomcat2" for alf2.

In alfresco-global.properties make these changes:

alfresco.host=alfrescodemo.com
alfresco.port=80
share.host=alfrescodemo.com
share.port=80

In share-config-custom.xml (in the tomcat/shared/classes/alfresco/web-extension folder) change all mentions of:

localhost:8080

to

localhost

Alfresco Cluster

Now, when you restart Alfresco you should be able to access http://alfrescodemo.com/share. This will take you to the Apache server installed on the database server. The load-balancer functionality within Apache will then route you to either alf1 or alf2. At this point, you should have a functioning Alfresco cluster. This was a demonstration, but it shows how simple the configuration can be in a production environment. The best way to implement this is to go one step at a time and make sure that each component works as expected before adding the next components that will eventually make this a fully functioning cluster.
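
To see which node the load balancer picked, you can look at the session cookie: with jvmRoute set, Tomcat appends the route to the session id (for example, an id ending in .tomcat1). The helpers below are a sketch for pulling that out of the response headers. The names route_of and session_id are made up, and the curl line assumes the hosts entry and cluster from this article are in place.

```shell
#!/bin/sh
# Extract the route suffix from a session id such as "0123ABCD.tomcat1".
route_of() {
    case "$1" in
        *.*) echo "${1##*.}" ;;
        *)   echo "none" ;;
    esac
}

# Pull the first JSESSIONID value out of HTTP response headers read from stdin.
session_id() {
    sed -n 's/^[Ss]et-[Cc]ookie: JSESSIONID=\([^;]*\).*/\1/p' | tr -d '\r' | head -n 1
}

# Against the load balancer (requires curl and a running cluster):
#   route_of "$(curl -s -o /dev/null -D - http://alfrescodemo.com/share/page | session_id)"
```

Stopping one Alfresco node and repeating the request without cookies should show the other route, which is a quick way to confirm that both balancer members are reachable.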