Spring Integration – FTP files not reprocessed and broken flow

Over the last few years I’ve done everything I could to avoid using FTP as a mechanism for passing data from one procedure to another inside our company.

Tools such as Spring Integration and RabbitMQ have helped me a lot in this process (making my life so much better).

However, we still have to use FTP as an exchange mechanism with several external partners.

Spring Integration has a wonderful FTP module that covers all of my needs; however, I found one case that required some googling to shed light on.

I had this scenario: a partner produces a file every n minutes and stores it on its own FTP server, always under the same file name. I regularly poll the FTP server, download the file locally and publish it to a RabbitMQ exchange.

So, basically, I have this kind of configuration (not the actual configuration, just an example):
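
A minimal sketch of such a setup could look like the following. Every bean id, host name, credential, directory and exchange name here is a made-up placeholder, the AMQP side is assumed to be a plain outbound channel adapter, and the clean-up of the local file after a successful delivery is left out:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:int="http://www.springframework.org/schema/integration"
       xmlns:int-file="http://www.springframework.org/schema/integration/file"
       xmlns:int-ftp="http://www.springframework.org/schema/integration/ftp"
       xmlns:int-amqp="http://www.springframework.org/schema/integration/amqp"
       xmlns:rabbit="http://www.springframework.org/schema/rabbit"
       xsi:schemaLocation="
         http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd
         http://www.springframework.org/schema/integration https://www.springframework.org/schema/integration/spring-integration.xsd
         http://www.springframework.org/schema/integration/file https://www.springframework.org/schema/integration/file/spring-integration-file.xsd
         http://www.springframework.org/schema/integration/ftp https://www.springframework.org/schema/integration/ftp/spring-integration-ftp.xsd
         http://www.springframework.org/schema/integration/amqp https://www.springframework.org/schema/integration/amqp/spring-integration-amqp.xsd
         http://www.springframework.org/schema/rabbit https://www.springframework.org/schema/rabbit/spring-rabbit.xsd">

    <!-- Hypothetical connection details for the partner's FTP server -->
    <bean id="ftpSessionFactory"
          class="org.springframework.integration.ftp.session.DefaultFtpSessionFactory">
        <property name="host" value="ftp.partner.example"/>
        <property name="port" value="21"/>
        <property name="username" value="user"/>
        <property name="password" value="secret"/>
    </bean>

    <int:channel id="ftpFiles"/>
    <int:channel id="toRabbit"/>

    <!-- Polls the remote directory once a minute and copies foo.csv locally;
         no local-filter is set, so the adapter's default local filter applies -->
    <int-ftp:inbound-channel-adapter id="ftpInbound"
            channel="ftpFiles"
            session-factory="ftpSessionFactory"
            remote-directory="/outgoing"
            filename-pattern="foo.csv"
            local-directory="/tmp/ftp-inbound"
            auto-create-local-directory="true">
        <int:poller fixed-delay="60000"/>
    </int-ftp:inbound-channel-adapter>

    <!-- Turns the local java.io.File payload into its text content -->
    <int-file:file-to-string-transformer input-channel="ftpFiles"
            output-channel="toRabbit"
            charset="UTF-8"/>

    <!-- Hypothetical broker and exchange -->
    <rabbit:connection-factory id="rabbitConnectionFactory" host="localhost"/>
    <rabbit:template id="amqpTemplate" connection-factory="rabbitConnectionFactory"/>

    <int-amqp:outbound-channel-adapter channel="toRabbit"
            amqp-template="amqpTemplate"
            exchange-name="partner.files"
            routing-key="foo"/>
</beans>
```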


This works nicely, with one important exception: if, for any reason, the outbound gateway cannot deliver the file to RabbitMQ (for example, if the server is unreachable, restarting, or under maintenance), then this is what happens (suppose that the file of interest is called foo.csv):

  1. the file foo.csv is transferred from the remote server to the local filesystem
  2. the file is passed from the FTP inbound to the RabbitMQ outbound
  3. the FTP inbound marks the file as “processed”
  4. the RabbitMQ outbound fails to deliver the message to the broker (so it doesn’t delete the local file)

This seems to be the desired behaviour (and it is!); the problem, however, arises at the next polling cycle:

  1. the FTP inbound sees that it already has a local file named foo.csv, so it won’t download the new one from the remote server
  2. it doesn’t process the local foo.csv either, because it has already marked it as “processed”

At this point, our chain is stuck and won’t start working again unless we restart the application.

The solution comes from the documentation (as always!).

From that page you can learn that the FTP inbound uses a FileListFilter to decide which local files to process. This filter can be changed by setting the local-filter attribute. The default filter is an AcceptOnceFileListFilter, which, as its name states, accepts a file only once.

The list of accepted files is stored in memory, which is why restarting the application makes our foo.csv file get processed again (of course, in case you need it, there are ways to make the list persistent).
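
As a side note, one way to keep that list across restarts is a persistent accept-once filter backed by a metadata store. The fragment below is only a sketch: the bean ids, the directory and the key prefix are made up, and it assumes a Spring Integration version that ships FileSystemPersistentAcceptOnceFileListFilter and in which PropertiesPersistingMetadataStore can serve as its store:

```xml
<!-- Stores the "already seen" entries in a properties file under the
     given (hypothetical) directory instead of keeping them only in memory -->
<bean id="metadataStore"
      class="org.springframework.integration.metadata.PropertiesPersistingMetadataStore">
    <property name="baseDirectory" value="/var/lib/myapp/metadata"/>
</bean>

<!-- Accept-once filter for local files; the second constructor argument
     is a key prefix used for the entries written to the store -->
<bean id="persistentLocalFilter"
      class="org.springframework.integration.file.filters.FileSystemPersistentAcceptOnceFileListFilter">
    <constructor-arg ref="metadataStore"/>
    <constructor-arg value="ftp-local-"/>
</bean>
```

Such a bean would be referenced from the inbound adapter through local-filter="persistentLocalFilter". Note that it makes the “processed” mark survive a restart, so it is the opposite of what the stuck foo.csv needs.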

Going back to our problem, the solution is to set the filter to an instance of the class AcceptAllFileListFilter.
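
In XML this could look roughly like the fragment below. It is a sketch rather than the real configuration: the ids ftpInbound, ftpFiles, ftpSessionFactory and acceptAllFilter, as well as the directories, are hypothetical placeholders, and the int and int-ftp namespace prefixes are assumed to be declared on the enclosing beans element:

```xml
<!-- A filter that never rejects a file, so a local foo.csv that is still
     on disk is picked up again on every polling cycle -->
<bean id="acceptAllFilter"
      class="org.springframework.integration.file.filters.AcceptAllFileListFilter"/>

<int-ftp:inbound-channel-adapter id="ftpInbound"
        channel="ftpFiles"
        session-factory="ftpSessionFactory"
        remote-directory="/outgoing"
        filename-pattern="foo.csv"
        local-directory="/tmp/ftp-inbound"
        auto-create-local-directory="true"
        local-filter="acceptAllFilter">
    <int:poller fixed-delay="60000"/>
</int-ftp:inbound-channel-adapter>
```

The only change with respect to a default setup is the local-filter attribute; the rest of the flow stays as it was.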


With this configuration in place, the remote foo.csv file won’t be downloaded until the local copy has been deleted, and the local copy is deleted after the first polling cycle that successfully delivers its content to the RabbitMQ broker.