Alfresco

Alfresco is an open source Enterprise Content Management (ECM) system that manages all the content within an enterprise and provides the services and controls that manage this content.

At the core of the Alfresco system is a repository, supported by a server, that persists content, metadata, associations, and full-text indexes. Programming interfaces support multiple languages and protocols, on which developers can build custom applications and solutions. Out-of-the-box applications provide standard solutions such as document management and web content management.

Because it is written entirely in Java, the Alfresco system runs on virtually any system that can run Java Enterprise Edition. At the core is the Spring platform, which makes it possible to modularize functionality such as versioning, security, and rules. Alfresco uses scripting to simplify adding new functionality and developing new programming interfaces. This portion of the architecture is known as web scripts and can be used for both data and presentation services. The lightweight architecture is easy to download, install, and deploy.

Alfresco Use Cases

Alfresco can be used for building most ECM applications. Aside from the major applications such as document, image, records, digital asset, and web content management, there are a number of specific applications and use cases that add value to the enterprise.

The following are typical ECM applications that can use Alfresco applications as the foundation. You can extend these applications by applying the Alfresco programming models, or build your own applications using Alfresco.

  • Document management manages and shares office documents, and incorporates business processes. It can be industry or role-specific. Alfresco Share is a good foundation for building document management applications.
  • Records management controls important information for retention over time. Use records management rather than document management in regulated or compliance-driven environments, such as when managing governmental information or personnel records, or where information is audited. Alfresco is certified to the U.S. Government DoD 5015.2 records standard and is useful for controlling retention and review periods, providing specialized security, and determining whether the records are archived or destroyed after a specified period of time.
  • Shared drive replacement is a more basic form of document management in the enterprise with a content repository that provides easy access points to content. Shared drives are simple to use because users do not need to be trained and all applications work with them. Because Alfresco supports the protocol used by shared drives, Common Internet File System (CIFS), the repository appears to be a shared drive. With rules, actions, and extensions, you can build complete document management applications that are transparent to users while getting the content under control and enabling it to be searched.
  • Enterprise portals and intranets keep employees informed of news and developments in the enterprise. While some enterprise portals focus on reporting and analyzing data, many are devoted to content and documents. Although folder hierarchies are an easy way to organize information for a portal, classifications and metadata are often a better way to target information to end users. Delivering content into a portal therefore involves elements of document management and business process; however, presenting lists of content and navigating through classifications requires programming portlets using web scripts or Java. Portlets provided as part of the Alfresco platform can supplement this development with standardized navigation, search, and content presentation.
  • Web content management manages websites, the content that goes into websites (such as HTML and images), and the processes of building, testing, and deploying websites and content. While Alfresco can be used for simple websites, it is frequently used for creating websites that are web applications, particularly those developed using Java. Some examples of these websites publish a lot of information from multiple sources and integrate e-commerce and back office systems. Surf is a good platform for creating these types of web applications and websites.
  • Knowledge management captures knowledge from employees or customers and provides it in a form that others can use. Content tends to be the best and most reusable container for sharing that knowledge with others.
  • Information publishing encompasses real-time publishing of content from different sources to the website and the deployment of that content to the web farm for Internet access. This can be digital assets such as articles, written internally or syndicated from other sources, or photos. Media companies use Alfresco to combine this content and publish it to their websites. This straight-through publishing of information requires both strong content control and performance to aggregate and push out the content.
  • Case management handles information related to a case, such as an insurance claim, an investigation, or personnel processing. Alfresco’s document management capabilities, folder structure, classification schemes, and workflow are well suited to managing cases and distributing the work involved in handling them. Alfresco incorporates the Activiti workflow engine and can handle sophisticated workflows and queue management. Alfresco has a content-oriented task model that aggregates all the resources required to perform specific tasks within the case handling process.

 

Alfresco API Guide

Alfresco supports a range of APIs (Application Programming Interfaces) that let developers write applications that access the Alfresco content repository, both on-premise and in the cloud. The Public API comprises two parts: CMIS, for tasks such as performing CRUD operations on documents and folders, querying for content, modifying ACLs, and creating relationships; and non-CMIS calls for things that CMIS doesn’t cover. When you run against Alfresco in the cloud, you use OAuth2 for authentication. When you run against Alfresco on-premise, you use basic authentication.

With the Public API you can:

  • Get a list of sites a user can see
  • Get some information about a site, including its members
  • Add, update, or remove someone from a site
  • Get information about a person, including their favorite sites and preferences
  • Get the tags for a node, including updating a tag on a node
  • Create, update, and delete comments on a node
  • Like/unlike a node
  • Do everything that CMIS can do
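
As a rough illustration of the on-premise case, the following Python sketch builds (but does not send) a basic-authentication request for the list of sites a user can see. The base path in API_BASE, the host, and the admin/admin credentials are assumptions made for illustration and should be checked against your own installation.

```python
import base64
import urllib.request

# Assumed base path of the on-premise public REST API; verify it for your install.
API_BASE = "/alfresco/api/-default-/public/alfresco/versions/1"


def build_sites_request(host, username, password):
    """Build (but do not send) a GET request for the sites visible to a user."""
    # Basic authentication is the base64 encoding of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(f"{host}{API_BASE}/sites", method="GET")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request
```

Passing the returned object to urllib.request.urlopen would send the request; a cloud deployment would replace the Authorization header with an OAuth2 bearer token instead.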

The Alfresco One API, which comes with the Enterprise edition, provides the CMIS protocol together with an additional set of REST endpoints, all secured with OAuth2 and SSL/TLS.

  • Use CMIS to store, retrieve, search and query folders, access metadata and thumbnails for documents, and read and write file content.
  • Use REST to access users, cloud networks, sites, the activity stream, and to read and write content tags and comments.

 

Access protocols

Alfresco supports a number of different protocols for accessing the content repository. These protocols give developers another avenue for building their own applications and extensions, in addition to the numerous other extension points and APIs built into Alfresco. For example, if you are building a client application that connects to multiple repositories from multiple vendors, including Alfresco, then CMIS is worth considering. If you are building a client that connects via the SharePoint Protocol, then use Alfresco Office Services (AOS).

Whichever protocol is used to access or upload content to the Alfresco repository, access control is always enforced based on the configured permissions.

The following table lists some of the main protocols supported by Alfresco.

Protocol | Description | Support Status
HTTP | The main protocol used to access the Alfresco content repository, for example through the Alfresco REST APIs. | Standard in Enterprise and Community.
WebDAV | Web-based Distributed Authoring and Versioning is a set of HTTP extensions that lets you manage files collaboratively on web servers. | Standard in Enterprise and Community.
FTP | File Transfer Protocol – standard network protocol for file upload, download and manipulation. Useful for bulk uploads and downloads. | Standard in Enterprise and Community.
CIFS | Common Internet File System – allows the projection of Alfresco as a native shared drive. Any client that can read or write to file drives can read and write to Alfresco, allowing the commonly used shared file drive to be replaced with an ECM system, without users knowing. | Standard in Enterprise and Community.
Alfresco Office Services | Alfresco Office Services (AOS) allows you to access Alfresco directly from all your Microsoft Office applications. | Enterprise only.
CMIS | Alfresco fully implements both the CMIS 1.0 and 1.1 standards to allow your application to manage content and metadata in an on-premise Alfresco repository or Alfresco in the cloud. | Standard in Enterprise and Community.
IMAP | Internet Message Access Protocol – allows access to email on a remote server. Alfresco can present itself as an email server, allowing clients such as Microsoft Outlook, Thunderbird, Apple Mail and other email clients to access the content repository, and manipulate folders and files contained there. | Standard in Enterprise and Community.
SMTP | It is possible to email content into the repository (InboundSMTP). A folder can be dedicated as an email target. | Standard in Enterprise and Community.

 

Alfresco programming models

A number of programming models are available for building an application using the Alfresco content application server.
  • The simplest model for non-programmers is to use out-of-the-box components of the Alfresco Share application and the Rules and Actions model, a set of conditions and actions to take on content based on those conditions. You can define rules and actions using a wizard and perform actions such as converting content, moving content, or executing a simple JavaScript snippet.
  • Web scripts let you perform more sophisticated processing without complex programming. The Alfresco Content Management Interoperability Services (CMIS) implementation was built using web scripts. By using JavaScript to build these data services, it is easy to create new services in Alfresco. To build new user interfaces or extensions to Alfresco Share, you can also use web scripts together with a web templating language such as FreeMarker. Most of Alfresco Share was built using web scripts.
  • To use Java to build applications or extend Alfresco Share, you can use the many Java tools that were used to build the Alfresco system. Surf, the web runtime framework, lets you extend Alfresco Share and build web applications. Because Alfresco Share was built using Surf, you can build your own extensions as a combination of Java programming and web scripts, or with Java alone. You can also use Java with the Spring platform to access or even replace whole pieces of Alfresco, whether in the content application server or in Alfresco Share. You can use the source code as an example when rewriting pieces, and use Spring beans and configuration to extend or replace functionality in Alfresco.
  • To write applications that use Alfresco but are portable to other ECM systems, you can use Content Management Interoperability Services (CMIS), the OASIS standard for accessing content repositories.
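
To give a feel for the portable CMIS model, here is a minimal Python sketch that composes a CMIS query and a browser-binding query URL. The endpoint path in CMIS_BROWSER and the cmisselector/q parameter names are assumptions based on the CMIS 1.1 browser binding, and the helper names are hypothetical; nothing here is taken from Alfresco's own code.

```python
from urllib.parse import urlencode

# Assumed CMIS 1.1 browser-binding endpoint of an on-premise repository.
CMIS_BROWSER = "/alfresco/api/-default-/public/cmis/versions/1.1/browser"


def escape_literal(value):
    """Escape backslashes and single quotes for a CMIS query string literal."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def documents_named_like(prefix):
    """Return a CMIS query for documents whose name starts with `prefix`."""
    # LIKE wildcards (% and _) inside `prefix` are not escaped in this sketch.
    return (
        "SELECT cmis:objectId, cmis:name FROM cmis:document "
        f"WHERE cmis:name LIKE '{escape_literal(prefix)}%'"
    )


def query_url(host, statement):
    """Build a browser-binding query URL for the given CMIS statement."""
    params = urlencode({"cmisselector": "query", "q": statement})
    return f"{host}{CMIS_BROWSER}?{params}"
```

Because the query uses only standard cmis: types and properties, the same statement should in principle work against other CMIS-compliant repositories; only the endpoint would differ.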

 

Getting Started

  1. Download the installer from the Alfresco download website.
  2. Run the setup wizard for Microsoft Windows. The wizard installs Alfresco and all the additional software and components required to run it, including a Tomcat application server, PostgreSQL database, JDK, and LibreOffice.

 

Installation on Tomcat

Install an instance of Tomcat 7 manually and modify it to use the correct directory structure and files for Alfresco.
These instructions recommend that you name the required directories as shared/classes and shared/lib because these are the path names used within full Alfresco installations. You can substitute alternative names for these directories. The installation directory for Tomcat is represented as <TOMCAT_HOME>.
  1. Download and install Tomcat version 7 following the instructions from http://tomcat.apache.org.
  2. Create the directories required for an Alfresco installation:
    1. Create the shared/classes directory.
    2. Create the shared/lib directory.
  3. Open the <TOMCAT_HOME>/conf/catalina.properties file.
  4. Change the value of the shared.loader= property to the following:

    shared.loader=${catalina.base}/shared/classes,${catalina.base}/shared/lib/*.jar

    Note: If you have used alternative names for the directories, you must specify these names in the shared.loader property.
  5. Copy the JDBC drivers for the database you are using to the lib/ directory.
  6. Edit the <TOMCAT_HOME>/conf/server.xml file.
  7. Set attributes of HTTP connectors.

    Tomcat uses ISO-8859-1 character encoding when decoding URLs that are received from a browser. This can cause problems when creating, uploading, and renaming files with international characters.

    By default, Tomcat uses an 8 KB header buffer size, which might not be large enough for Kerberos and NTLM authentication protocols.

    Locate the Connector sections, and then add the URIEncoding="UTF-8" and maxHttpHeaderSize="32768" attributes.

    <Connector port="8080" protocol="HTTP/1.1" URIEncoding="UTF-8" connectionTimeout="20000" redirectPort="8443" maxHttpHeaderSize="32768"/>
  8. Save the server.xml file.
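
The connector change in step 7 can also be scripted. The following Python sketch sets the two attributes on every Connector element in a server.xml document; the function name is hypothetical, and because the standard-library parser discards XML comments, treat it as an illustration rather than a drop-in tool for a production server.xml.

```python
import xml.etree.ElementTree as ET

# Attribute values taken from step 7 above.
REQUIRED = {"URIEncoding": "UTF-8", "maxHttpHeaderSize": "32768"}


def patch_connectors(server_xml_text):
    """Return server.xml text with the required attributes set on each Connector."""
    root = ET.fromstring(server_xml_text)
    for connector in root.iter("Connector"):
        for name, value in REQUIRED.items():
            # set() adds the attribute or overwrites an existing value.
            connector.set(name, value)
    # Note: comments from the original file are lost by this round trip.
    return ET.tostring(root, encoding="unicode")
```

Existing attributes such as port and redirectPort are left unchanged, so the result matches the Connector example shown in step 7.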

 

When using Internet Explorer versions 7 and 8, if you try to download a document from Alfresco Share running in Tomcat with HTTPS (SSL) enabled, you might see an error message. To resolve this issue, add the following line to the Context element in the <TOMCAT_HOME>/conf/context.xml file:

<Valve className="org.apache.catalina.authenticator.SSLAuthenticator" securePagesWithPragma="false" />

A WAR file is a JAR file used to distribute a collection of files (JavaServer Pages, servlets, Java classes, XML files, tag libraries, and static web pages) that together constitute a web application.

Use this method of installing if you have already installed a JDK, a supported database, an application server, and the additional Alfresco components.
The Alfresco Community Edition Distribution file is a zip file containing the required WAR files, along with the additional commands and configuration files needed for a manual installation.
  1. Download the following file:

    alfresco-community-distribution-201512-EA.zip

  2. Specify a location for the download and extract the file.

    You see the following directory structure:

    alf_data
    amps
    amps_share
    bin
    common
    java
    licenses
    modules
    postgresql
    scripts
    solr4
    tomcat

    The Distribution zip also contains the following file:

    README.txt

    The /alf_data directory contains the following directories:

    contentstore

    This directory and subdirectories contain contentstore .bin files.

    contentstore.deleted

    This directory contains any deleted contentstore files.

    keystore

    This directory contains the following files.

    File name | Description
    browser.p12 | The pkcs12 keystore generated from ssl.keystore that contains the repository private key and certificate for use in browsers, such as Firefox.
    CreateSSLKeystores.txt | Contains instructions to create an RSA public/private key pair for the repository with a certificate that has been signed by the Alfresco Certificate Authority (CA).
    generate_keystores.bat | Windows batch file for generating secure keys for Solr communication.
    generate_keystores.sh | Linux script file for generating secure keys for Solr communication.
    keystore | Secret key keystore containing the secret key used to encrypt and decrypt node properties.
    keystore-passwords.properties | Contains the passwords protecting the keystore entries.
    readme.txt | Text file containing information about the other files in the directory.
    ssl-keystore-passwords.properties | Contains passwords for the SSL keystore.
    ssl-truststore-passwords.properties | Contains passwords for the SSL truststore.
    ssl.keystore | Repository keystore containing the repository private/public key pair and certificate.
    ssl.truststore | Repository truststore containing certificates that the repository trusts.

    oouser

    This directory contains files and folders relating to Open Office.

    postgresql

    This directory contains files and folders relating to PostgreSQL.

    solr4

    This directory contains files and folders relating to Solr4.

    The /solr4 directory contains the following folders:

    Folder name | Description
    /content | This directory contains a compressed copy of all the Solr documents added to the index.
    /index | This directory contains all the indexes of the archive and workspace cores.
    /model | This directory contains all the models.

    For more information, see Solr directory structure.

    The /amps directory contains the following files:

    File name | Description
    alfresco-googledocs-repo-3.0.3-3ent.amp | GoogleDocs Repository AMP
    alfresco-share-services.amp | Share Services AMP

    The /amps_share directory contains the following file:

    File name | Description
    alfresco-googledocs-share-3.0.3-3ent.amp | GoogleDocs Share AMP

    The following files are contained in the suggested subdirectories for the Tomcat application server:

    /bin

    File name | Description
    alfresco-mmt.jar | Alfresco Module Management Tool (MMT).
    alfresco-spring-encryptor.jar | Alfresco Encrypted Properties Management tool
    apply_amps.bat | Windows batch file for Tomcat application server installs, used to apply all AMP files in the <installLocation> directory.
    apply_amps.sh | Linux script file for Tomcat application server installs, used to apply all AMP files in the <installLocation> directory.
    clean_tomcat.bat | Windows batch file for cleaning out temporary application server files from previous installations.
    clean_tomcat.sh | Linux script for cleaning out temporary application server files from previous installations.

    The /java directory contains files and folders relating to Java.

    The /modules directory contains the following directories:

    platform
    share

    You can put simple JAR modules in these folders, and they are loaded when Alfresco starts up. See Simple Module for more information.

    The /postgresql directory contains files and folders relating to PostgreSQL.

    The /scripts directory contains environment scripts.

    The /solr4 directory contains the following files and folders:

    File or folder name | Description
    /alfrescoModels | This directory contains all the content models that come out of the box with Alfresco. Any new custom content model added to Alfresco is synced to this directory so that Solr 4 knows about it.
    /archive-SpacesStore | Configuration directory for the archive core.
    context.xml | Configuration file that specifies the Solr web application context template to use when installing Solr in a separate Tomcat server.
    /lib | This directory contains extra libraries that Solr 4 loads on start up. These libraries are used to communicate with Alfresco by using CMIS, the Alfresco data model, or Alfresco Surf Web Scripts.
    log4j-solr.properties | Configuration file for Solr 4-specific logging.
    solr.xml | Configuration file that specifies the cores to be used by Solr 4.
    /templates |
    /workspace-SpacesStore | Configuration directory for the workspace core.

    The /tomcat directory has a standard Tomcat structure, including /shared and /webapps directories.

    The /shared directory contains the Alfresco configuration files:

    File name | Description
    /classes/alfresco-global.properties.sample | The global properties file, which is used for Alfresco configuration properties.
    /classes/encrypted.properties | An encrypted properties overlay file.
    /classes/alfresco | Contains the Alfresco directory structure for the configuration override files, including the extension and web-extension directories.

    /webapps

    File name | Description
    _vti_bin.war | SharePoint site dispatcher
    alfresco.war | The Alfresco WAR file
    ROOT.war | Application for the server root
    share.war | The Alfresco Share WAR file
    solr4.war | The Solr 4 WAR file
  3. Move the WAR files from /webapps to the appropriate location for your application server.

    For example, for Tomcat, move the WAR files to the <TOMCAT_HOME>/webapps directory.

    Note: If you are using JBoss, you must customize the web.xml file in the _vti_bin.war, ROOT.war and share.war files to include this code fragment:

    <context-param>
       <param-name>org.jboss.jbossfaces.WAR_BUNDLES_JSF_IMPL</param-name>
       <param-value>true</param-value>
    </context-param>

    This tells the JSF deployer in JBoss that each WAR file bundles its own JSF implementation, so that the bundled version is used instead of the one built into JBoss, and allows AOS to deploy successfully.

  4. Remove all directories named _vti_bin, alfresco, ROOT, share and solr4 in <TOMCAT_HOME>/webapps.

    If you do not remove these directories, then the WAR files will not be deployed when the server starts.

  5. Edit the /shared/classes/alfresco-global.properties.sample file with your configuration settings.
  6. Save the file without the .sample extension.
  7. Move the alfresco-global.properties file to <classpathRoot>.

    For example, <TOMCAT_HOME>/shared/classes.

Note: If you deployed previous versions of Alfresco, you must remove any temporary files created by your application server. Use the clean_tomcat.bat or clean_tomcat.sh command.
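
Step 4 above (removing the exploded web application directories so that the new WAR files are deployed) can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes a standard <TOMCAT_HOME>/webapps layout; it is not a replacement for the supplied clean_tomcat scripts.

```python
import shutil
from pathlib import Path

# Directory names listed in step 4.
WEBAPP_NAMES = ("_vti_bin", "alfresco", "ROOT", "share", "solr4")


def remove_exploded_webapps(tomcat_home):
    """Delete exploded webapp directories, leaving the .war files in place."""
    removed = []
    webapps = Path(tomcat_home) / "webapps"
    for name in WEBAPP_NAMES:
        target = webapps / name
        # Only directories are removed; alfresco.war and friends are untouched.
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed
```

The function returns the names it actually removed, which makes it easy to log what changed before restarting the server. Stop Tomcat before running anything like this.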

 

Configuration

The following are the file locations for Alfresco configuration files, extensions, and plugins.

<classpathRoot> directory (Windows) – The <classpathRoot> is a directory whose contents are automatically added to the start of your application server classpath. The location of this directory varies depending on your application server. For example:

  • (Tomcat) C:\Alfresco\tomcat\shared\classes

<classpathRoot> directory (Linux) – The <classpathRoot> is a directory whose contents are automatically added to the start of your application server classpath. The location of this directory varies depending on your application server. For example:

  • (Tomcat) tomcat/shared/classes/

alfresco-global.properties file – The alfresco-global.properties file is where you store all the configuration settings for your environment. The file is in Java properties format, so backslashes must be escaped. The file should be placed in <classpathRoot>. When you install Alfresco using the setup wizard, an alfresco-global.properties file is created, which contains the settings that you specified in the wizard. An alfresco-global.properties.sample file is supplied with the setup wizard and also with the WAR zip file. This .sample file contains examples of common settings that you can copy into your alfresco-global.properties file.
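
To illustrate the backslash rule, the following Python sketch formats one line of a properties file, doubling each backslash so that a Windows path survives Java properties parsing. The helper names are hypothetical, and dir.root is used here only as an example key.

```python
def escape_property_value(value):
    """Double each backslash so the value survives Java properties parsing."""
    return value.replace("\\", "\\\\")


def property_line(key, value):
    """Format a single key=value line for a Java properties file."""
    return f"{key}={escape_property_value(value)}"
```

For example, the path C:\Alfresco\alf_data is written to the file as dir.root=C:\\Alfresco\\alf_data. A value written with forward slashes contains nothing to escape and is emitted unchanged.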

<extension> directory – The <extension> directory is where you store Spring configuration files that extend and override the system configuration. This directory can be found at <classpathRoot>\alfresco\extension.

<web-extension> – The <web-extension> directory is where you store Spring configurations that extend and override the system Share configuration. This directory can be found at <classpathRoot>\alfresco\web-extension.

<solrRootDir> – The <solrRootDir> directory is the Solr home directory which contains the Solr core directories and configuration files. This directory can be found at <ALFRESCO_HOME>\solr4.

<configRoot> – The <configRoot> directory contains the default application configuration. For example, for Tomcat, <configRoot> is <TOMCAT_HOME>\webapps\alfresco\WEB-INF.

<configRootShare> – The <configRootShare> directory contains the default application configuration for Share. For example, for Tomcat, <configRootShare> is <TOMCAT_HOME>\webapps\share\WEB-INF.
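
The locations above all derive from two base directories. As a sketch, assuming a Tomcat installation with the default layout described in this section, the following Python function resolves each placeholder from <TOMCAT_HOME> and <ALFRESCO_HOME>; the function name and dictionary keys are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def config_locations(tomcat_home, alfresco_home, windows=False):
    """Map each configuration placeholder to its path under a Tomcat install."""
    # Pure paths only compute names; nothing is read from or written to disk.
    path_type = PureWindowsPath if windows else PurePosixPath
    tomcat = path_type(tomcat_home)
    classpath_root = tomcat / "shared" / "classes"
    return {
        "classpathRoot": classpath_root,
        "extension": classpath_root / "alfresco" / "extension",
        "web-extension": classpath_root / "alfresco" / "web-extension",
        "solrRootDir": path_type(alfresco_home) / "solr4",
        "configRoot": tomcat / "webapps" / "alfresco" / "WEB-INF",
        "configRootShare": tomcat / "webapps" / "share" / "WEB-INF",
    }
```

Calling config_locations("C:\\Alfresco\\tomcat", "C:\\Alfresco", windows=True) yields Windows-style paths, while the default produces the Linux forms.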

Website

 

Tutorials

  • ecmarchitect.com

    Jeff Potts’ blog on Alfresco, content management, BPM, and search

    Source: ecmarchitect.com/

References