Ephesoft + Dropbox: What about importing?

My previous blog post explained how to create an export plugin, focused on Dropbox. After that, I started to think about importing. An import module is not so easy: the custom workflow management feature offered by Ephesoft only takes effect after the import. However, there are different ways of importing documents into Ephesoft, for example by email or using CMIS. Ephesoft provides a CMIS import feature but, strictly speaking, it's not really a module. So, we are going to see how to build an import feature for Dropbox, inspired by the CMIS importer. This import feature is embedded in the Dropbox plugin that I created, but it could also be deployed as a simple JAR file.

Context file

First, we are going to create a new Spring context file. In this file, we define a bean and a scheduler that runs the same method at regular intervals.

<?xml version="1.0" encoding="UTF-8"?>
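<beans xmlns="http://www.springframework.org/schema/beans"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xmlns:context="http://www.springframework.org/schema/context"
	xmlns:task="http://www.springframework.org/schema/task"
	xmlns:util="http://www.springframework.org/schema/util"
	xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
		http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
		http://www.springframework.org/schema/task http://www.springframework.org/schema/task/spring-task.xsd
		http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util.xsd"
	default-autowire="byName">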

	<task:annotation-driven />
	<util:properties id="dropboximport" location="classpath:/META-INF/dcma-dropboximport-plugin/dcma-dropboximport-plugin.properties" />
    <context:property-placeholder properties-ref="dropboximport" />
    
	<bean id="dropboxImporter" class="com.bataon.ephesoft.dcma.dropbox.DropboxImporter">
		<property name="appKey" value="#{dropboximport['dropbox.appKey']}" />
		<property name="appSecret" value="#{dropboximport['dropbox.appSecret']}" />
		<property name="appCode" value="#{dropboximport['dropbox.appCode']}" />
		<property name="batchClassConfig" value="#{dropboximport['dropbox.batchClassConfig']}" />
		<property name="batchClassService" ref="batchClassServiceImpl" />
	</bean>

	<task:scheduled-tasks>
		<task:scheduled ref="dropboxImporter" method="dropboxImport" cron="#{dropboximport['cron.expression']}"></task:scheduled>
	</task:scheduled-tasks>

</beans>

The first XML tag, task:annotation-driven, has been added so that scheduling can also be defined in the Java class with annotations. Next, we define a reference to a property file, which is used to configure the import process. Unlike the CMIS import, there is no simple way to add a user interface to configure it dynamically, so we'll see how the configuration file has to be written to be valid for this import plugin. Finally, we create the scheduled task: we simply configure Ephesoft to execute the same method of the Dropbox importer according to a cron expression.

Property file

The property file is quite easy to understand. The first step is to define the application secret and the application key. You get this information when you create an application in the Dropbox developer console. The first time you start Ephesoft with this module, you'll get an error together with a Dropbox link where you authorize the application you created to access your account. Once you have set the application code, the token value is saved automatically and re-used for all subsequent requests. These four properties are used for authentication.

The next one, cron.expression, defines the schedule. In our case, the cron expression calls the Dropbox importer every 30 seconds. The property dropbox.batchClassConfig defines which batch classes are used by the Dropbox import. In this example, we have only the batch class BC5, but you can easily define more than one batch class, separated by the character ;.

For each batch class, you need to define three properties. First, the suffix folder defines which Dropbox folder is watched. Next, the suffix pattern defines the regular expression that the file name must match. In this case, we specify that we want to import only PDF files. Finally, after the import, we need to mark the files as already imported so that the same document isn't imported a second time; this is the role of the suffix action. In our case, we use the action moveTo: when the import is finished, the file is moved to another Dropbox folder, which is not watched by Ephesoft. Another available action is delete, which deletes the file from Dropbox after the import.

dropbox.appSecret=XXX
dropbox.appKey=XXX
dropbox.appCode=XXX
token=YYY
cron.expression=*/30 * * * * ?

dropbox.batchClassConfig=BC5
BC5.folder=/Import/HR
BC5.pattern=.*\.pdf
BC5.action=moveTo|/Import/HR/Completed
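
To make the layout of this file more concrete, here is a minimal standalone sketch of how such a configuration could be read with plain java.util.Properties. The class DropboxImportConfigSketch and its method names are hypothetical and are not part of the plugin; only the property keys come from the file above. It also assumes a second, made-up batch class BC6 to show the ; separator.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of the plugin: reads the per-batch-class settings.
final class DropboxImportConfigSketch {

	private final Properties properties = new Properties();

	DropboxImportConfigSketch(String propertyFileContent) {
		try {
			// Properties.load applies the usual escaping rules of .properties files.
			properties.load(new StringReader(propertyFileContent));
		} catch (IOException e) {
			throw new IllegalStateException("Unable to read the configuration", e);
		}
	}

	// Batch class identifiers are separated by ';' in dropbox.batchClassConfig.
	List<String> batchClassIdentifiers() {
		List<String> identifiers = new ArrayList<>();
		for (String id : properties.getProperty("dropbox.batchClassConfig", "").split(";")) {
			if (!id.trim().isEmpty()) {
				identifiers.add(id.trim());
			}
		}
		return identifiers;
	}

	// Looks up "<identifier>.<suffix>", for example "BC5.folder".
	String setting(String batchClassIdentifier, String suffix) {
		return properties.getProperty(batchClassIdentifier + "." + suffix);
	}

	public static void main(String[] args) {
		// BC6 is an invented second batch class, only here to show the ';' separator.
		String content = "dropbox.batchClassConfig=BC5;BC6\n"
				+ "BC5.folder=/Import/HR\n"
				+ "BC5.action=moveTo|/Import/HR/Completed\n"
				+ "BC6.folder=/Import/Finance\n"
				+ "BC6.action=delete\n";
		DropboxImportConfigSketch config = new DropboxImportConfigSketch(content);
		for (String id : config.batchClassIdentifiers()) {
			System.out.println(id + ": " + config.setting(id, "folder") + " / " + config.setting(id, "action"));
		}
	}
}
```

One detail worth knowing if the file is loaded this way: java.util.Properties treats the backslash as an escape character and silently drops it in front of a character like the dot. A pattern written .*\.pdf is then read as .*.pdf, which still matches PDF file names but is slightly more permissive than intended.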

Java Class

So, we need to define our Dropbox importer so that it conforms to our bean definition:

public class DropboxImporter {

	private static final Logger LOGGER = LoggerFactory.getLogger(DropboxImporter.class);

	private String appKey;
	private String appSecret;
	private String appCode;
	private String batchClassConfig;
	
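	private BatchClassService batchClassService;

	// Spring injects the properties above through setters (setAppKey, setBatchClassService, ...), omitted here.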

	public void dropboxImport() throws DCMAException {
		...
	}

	private void importFilesFromDropbox(DbxEntry entry, String bcIdentifier, DbxClient client) throws Exception {
		...
	}

}

Next, we need to define the import method, which is called by the Spring scheduler. This method is only in charge of parsing the configuration and delegating the actual import to another method.

public void dropboxImport() throws DCMAException {
	DropboxHelper helper = new DropboxHelper(getProperties());
	helper.setPluginName("dcma-dropboximport-plugin");
	DbxClient client = helper.authenticateApp();

	// dropbox.batchClassConfig holds the batch class identifiers, separated by ';'
	String[] bcs = getBatchClassConfig().split(";");
	for (String batchClassIdentifier : bcs) {

		LOGGER.debug("Importing the batch class " + batchClassIdentifier);

		try {
			String dropboxFolder = getProperty(batchClassIdentifier, "folder");
			String filePattern = getProperty(batchClassIdentifier, "pattern");
			// The action is either "delete" or "moveTo|<target folder>"
			String[] action = getProperty(batchClassIdentifier, "action").split("\\|");
			String actionName = action[0];

			LOGGER.debug(" - Dropbox folder: " + dropboxFolder);
			LOGGER.debug(" - File pattern: " + filePattern);

			DbxEntry.WithChildren listing = client.getMetadataWithChildren(dropboxFolder);
			LOGGER.debug(" - Files in the folder:");
			if (listing != null) {
				for (DbxEntry child : listing.children) {
					LOGGER.debug(" -- " + child.name);

					if (child.name.matches(filePattern)) {
						LOGGER.debug(" -- Import file");
						importFilesFromDropbox(child, batchClassIdentifier, client);

						// Mark the file as processed so that the next run doesn't import it again
						if (actionName.equalsIgnoreCase("moveTo")) {
							client.move(child.path, action[1] + "/" + child.name);
						} else if (actionName.equalsIgnoreCase("delete")) {
							client.delete(child.path);
						}
					}
				}
			}
		} catch (Exception e) {
			// Log the failure and continue with the next batch class
			LOGGER.error("Import failed for the batch class " + batchClassIdentifier, e);
		}
	}
}
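
The filtering and the post-import step of this loop can be tried in isolation, without Dropbox or Ephesoft. The sketch below is only an illustration: PostImportActionSketch and resolve are hypothetical names, and the method just computes what the loop would ask Dropbox to do for each file name. It never calls the Dropbox API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, Dropbox-free illustration of the filter and of the post-import decision.
final class PostImportActionSketch {

	// Returns one line per matching file: "move <from> -> <to>" or "delete <path>".
	static List<String> resolve(String folder, List<String> fileNames, String pattern, String action) {
		String[] parts = action.split("\\|");
		String actionName = parts[0];
		List<String> operations = new ArrayList<>();
		for (String name : fileNames) {
			if (!name.matches(pattern)) {
				continue; // not selected by <batch class>.pattern
			}
			String path = folder + "/" + name;
			if (actionName.equalsIgnoreCase("moveTo")) {
				if (parts.length < 2) {
					throw new IllegalArgumentException("moveTo needs a target folder: " + action);
				}
				operations.add("move " + path + " -> " + parts[1] + "/" + name);
			} else if (actionName.equalsIgnoreCase("delete")) {
				operations.add("delete " + path);
			}
		}
		return operations;
	}

	public static void main(String[] args) {
		List<String> names = Arrays.asList("contract.pdf", "notes.txt");
		for (String operation : resolve("/Import/HR", names, ".*\\.pdf", "moveTo|/Import/HR/Completed")) {
			System.out.println(operation);
		}
	}
}
```

Note that String.matches is case-sensitive, so a file named SCAN.PDF would not be picked up by this pattern. A pattern such as (?i).*\.pdf would accept both spellings.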

Finally, we define the method in charge of copying a document into the UNC folder of the relevant batch class.

private void importFilesFromDropbox(DbxEntry entry, String bcIdentifier, DbxClient client) throws Exception {
	String fileName = entry.name;
	if (null != fileName) {
		int indexOfDot = fileName.indexOf('.');
		if (-1 != indexOfDot) {
			BatchClass batchClass = batchClassService.getBatchClassByIdentifier(bcIdentifier);

			if (batchClass != null) {
				// Target: <UNC folder>/<file name up to the first dot>_<timestamp>/<file name>
				String uncFolder = batchClass.getUncFolder();
				String destinationFolder = EphesoftStringUtil.concatenate(new Object[] { uncFolder, File.separator, fileName.substring(0, indexOfDot), Character.valueOf('_'),
						Long.valueOf(System.currentTimeMillis()) });
				String destinationFile = EphesoftStringUtil.concatenate(new String[] { destinationFolder, File.separator, fileName });

				File folder = new File(destinationFolder);
				if (!folder.exists()) {
					folder.mkdir();
				}

				// Close the stream even if the download fails
				FileOutputStream outputStream = new FileOutputStream(destinationFile);
				try {
					client.getFile(entry.path, null, outputStream);
				} finally {
					outputStream.close();
				}
			} else {
				LOGGER.debug(bcIdentifier + " doesn't exist.");
			}
		}
	}
}
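
The naming scheme used for the sub-folder is easier to see on its own. Here is a small sketch, assuming a local temporary directory in place of the real UNC folder and an in-memory stream in place of the Dropbox download. UncDropSketch and its methods are hypothetical names, and the sketch uses java.nio.file instead of the Ephesoft utilities.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of the "<name>_<timestamp>" sub-folder created per imported file.
final class UncDropSketch {

	// Builds "<part before the first dot>_<timestamp>", e.g. "receipt_42" for "receipt.pdf" and 42.
	static String folderNameFor(String fileName, long timestampMillis) {
		int indexOfDot = fileName.indexOf('.');
		if (indexOfDot == -1) {
			throw new IllegalArgumentException("File name has no extension: " + fileName);
		}
		return fileName.substring(0, indexOfDot) + "_" + timestampMillis;
	}

	// Copies the content into <uncFolder>/<name>_<timestamp>/<fileName> and returns that file.
	static File drop(File uncFolder, String fileName, long timestampMillis, InputStream content) throws IOException {
		Path folder = uncFolder.toPath().resolve(folderNameFor(fileName, timestampMillis));
		Files.createDirectories(folder);
		Path target = folder.resolve(fileName);
		// Files.copy fails if the target already exists and leaves the input stream open.
		Files.copy(content, target);
		return target.toFile();
	}

	public static void main(String[] args) throws IOException {
		File uncFolder = Files.createTempDirectory("unc-sketch").toFile();
		try (InputStream content = new ByteArrayInputStream("dummy".getBytes(StandardCharsets.UTF_8))) {
			File dropped = drop(uncFolder, "receipt.pdf", System.currentTimeMillis(), content);
			System.out.println("Written " + dropped.length() + " bytes to " + dropped);
		}
	}
}
```

Because the name is cut at the first dot, a file called scan.2014.pdf ends up in a folder named scan_<timestamp>. The timestamp suffix is what keeps two files with the same name from landing in the same folder.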

Conclusion

This import feature works well, but it's not so easy to configure, mainly because there is no UI for it: you need to create a file in the right place on the server. But once you know how to do it, this process is quite useful.

Using Dropbox can be interesting for several reasons. Imagine that you want to build a batch class to process expenses. To reduce the time needed to process small expenses such as restaurant receipts, you can let your users take a picture of their receipts and upload them to Dropbox. In terms of security, you only open a tunnel between your Ephesoft instance and Dropbox. Many other use cases can be imagined.

 
