A few years ago I was hired at a local credit union right before they were getting ready to convert to a different Core Account Processor. My primary job at the time was to make sure all of the desktops were up to date so that the new software would work… My role changed a little after the conversion.
Getting the Member Statements from the Old Processor
We probably waited months before we received the statements from the old processor. The format in which we received them was absolutely grotesque. There was one file per year. What I mean is that EVERY member statement for EVERY member for a specific year was in one text file. These files were about 512 mb which meant that Notepad couldn’t open them, non of our employees could open them. In the beginning I had to install Notepad++ and order more RAM for my computer just so I could retrieve a statement for someone. The old processor had recommended that we pay several hundred dollars for a piece of software that should be able to open these files. That was not going to be an option for us. So it was decided I would use my computer science and python skills to try to parse these statements.
Parsing Those Member Statements
At first glance, using notepad++, these files had nothing that made it obvious when a statement was finished. Also, important info like member name and account number was not always in the same place on the statement, so capturing that information was incredibly difficult. Since Notepad++ was also chugging trying to open these files, I installed gVim, which is an offshoot of linux’s vim, made for Windows. gVim was able to see a lot more data in these files and I was able to find something placed in between every statement. At the end of each statement was a hidden escape character. I had something to parse on now! So I wrote my python script to dump each statement to a file and named it by account number. This process still wasn’t easy. Getting those account numbers was extremely difficult. So I had to write a lot of code to constantly correct for the location of the account number in each statement. I named each statement in the following format, ####MMYYYY.txt where # equals the account number, M = month, and Y = year. Anyways, I was able to get over 1 million statements parsed.
You Have All These Files, Now What?
Once the files were parsed, I tried to make them available to the employees, so I wouldn’t have to constantly take in requests for a statement. So I placed them on our server, which was a big mistake. Windows had a LOT of troubles trying to display that many files in Explorer. I tried putting all of the files in folders by year and month, but it didn’t work. The server would get laggy, people’s desktops would get laggy trying to load all of those files. So my solution? Build a web app.
The Web Application: Version 1
I felt that IIS and Windows wasn’t going to be a good solution to start this, mainly because getting Windows to do CGI scripts in that past, was kind of a pain. So I found an old desktop and installed Ubuntu Server, Apache2, and Python. I knew that writing a script to search the file system was going to be incredibly I/O heavy, especially for this desktop machine that was only a P4 and had 1GB of RAM, so I knew MySQL would have to come in to play for this. So I wrote another Python script to go through all of the statements. This script recorded each file’s account number, month, year, and file location, then uploaded that information to the database.
So to allow employees access to this information, I wrote a Python CGI script that would allow them to type in a member’s account number and select a year. Then the script would display all of the available files for that search in a table, with a link to each file.
For security reasons, I setup Apache to authenticate with our Active Directory server, so if someone from outside, somehow managed to connect to our network, they wouldn’t be able to access those files without logging in.
Also, this server WAS NOT accessible from outside the credit union. You SHOULDNT DO THAT UNLESS YOU KNOW WHAT YOURE DOING.
This proved to work fairly well, but I never really liked it, it felt clunky, I didn’t feel it was secure enough, so a year and a half later, I upgraded it.
The Web Application: Version 2
Since the original version of the web application, I had learned more about MySQL and PHP from my computer science classes. I decided to rewrite the entire application in PHP. This allowed me to add more features to the application, such as the ability for employees to upload more files for members, such as signature cards, etc. Since I already had the database setup, I just had to rewrite the code, to make the application look better, and also switching to PHP caused the application to actually run a lot faster than its Python counterpart.
For authentication, I was able to go away from using Apache’s authentication configuration and wrote the authentication right in the application. The application still authenticates with AD. Another security feature I was able to add, was logging. I now keep logs of who accesses what statements, and who fails at logging in (using a wrong password) and who successfully is able to log in. Since I had changed how the application logged in, I had to lock down the directories that stored the statements, so they wouldn’t be accessible from the web. I had to change how the application provided the statements. The application now would have to read the requested statement and print it to the screen, rather than letting the users’ browser open a text file.
My solution is good for in house, and I would never feel comfortable placing it on the internet. This is an intranet only solution, and in my opinion, the members statement is completely secure.
If you would like more information, or even my assistance in setting something up like this, feel free to email me at support@rychannel.com.