Tuesday, November 1, 2011

MySQL BLOB export application for Linux

In the past, present and I'm sure in the future I will make choices based on what seemed like a good idea at the time and live to regret the lack of imagination you had when the small thing you chose to do back then is now straining under load that you never thought it would have.

Naturally I am talking about the idea of embedding files into a MySQL database using BLOB fields. "It's fine because it's only a small amount of data and it's easier to backup and manage than using the filesystem... right?"

There is undoubtedly pro's and con's to embedding files into a database like this and I can tell you from experience that a single table holding file name, file size, mime type and the blob itself can easily and even relatively quickly store many gigabytes of data. What happens though when the client decides he wants to move on and take all his files he has stored with him.. but obviously want's them as files and not just a SQL dump?

There may be existing solutions and I know there is other options available but since that doesn't teach us anything and may not offer the functionality we need, I have decided to roll my own using c++.

This is coded and compiled on the latest copy of Ubuntu Linux. I don't pretend to ensure that it will work elsewhere or even on older versions. I'm sure it shouldn't be too difficult to port however, and as this example is very hard coded and inflexible you will definately want to get your hands dirty with the code to suit your situation. Also once again I am not going to cover installing the development libraries and so on... if you need help with that sort of thing let me know and I'll see what I can do.

It's also important to note that in this example I am connecting the uploads table and the clients table together (you should be able to read the SQL easy enough). I am doing this as this application will output an index file that lists what CRM client records the embedded files are related to. (So when a client asks for his files he will be impressed that you provide an index to make his job importing/using it elsewhere much easier)
Obviously if your system isn't a CRM or doesn't require such an index file you can remove this functionality all together.

Here goes. Fire up your favorite c++ editor/IDE and create a file called whatever you want.. (I called mine blobbackup) Oh... and don't you dare copy and paste. You won't learn anything doing that. :P

#include <mysql.h>
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <sys/stat.h>

int main(int argc, char* argv[]) {
   
// Check if the database name was passed on the cmd-line and if not output usage instructions
// and terminate.
    if(argv[1] == NULL) {
        std::cout << "Usage :\n";
        std::cout << "./docbackup <database_name>\n\n";
        std::cout << "(Where '<database_name>' is the name of the MySQL database to connect to.)\n";
        return 1;
    }
   
    MYSQL *conn;
    MYSQL_RES *res;
    MYSQL_ROW row;

// Usernames and password changed to protect the guilty
// If you need somewhere to start to make this more useful, I'm sure you can
// work out how to pass the server, username and password on the cmd-line rather than
// being evil and hardcoding it. :)
    const char *server = "localhost";
    const char *user = "someuser";
    const char *password = "somepass";
    char *database = argv[1];
   

    std::ofstream myfile;
    std::ofstream logfile;

    conn = mysql_init(NULL);

    // Connect to the database
    if(!mysql_real_connect(conn, server, user, password, database, 0, NULL, 0)) {
        fprintf(stderr, "%s\n", mysql_error(conn));
        return 1;
    }
   
    // Send SQL query
    if(mysql_query(conn, "select uploads.content,uploads.name,LENGTH(uploads.content),clients.legalname,clients.tradingname from uploads LEFT JOIN clients ON uploads.client_id = clients.client_id where 1")) {
        fprintf(stderr, "%s\n", mysql_error(conn));
        return 1;
    }
   
    res = mysql_use_result(conn);
   
    // Create a directory for this database to export files into
    mkdir(argv[1],S_IRWXU | S_IRWXG | S_IROTH | S_IXOTH);
    std::stringstream f;
    f << argv[1] << "/" << "files.csv";
    std::string logfilename = f.str();
   
    // Output results
    logfile.open(logfilename.c_str(), std::ios::out);
    printf("files:\n");
    while((row = mysql_fetch_row(res)) != NULL) {
        std::string lname;
        std::string tname;
        if(row[3] != NULL)
            lname = row[3];

        if(row[4] != NULL)
            tname = row[4];
           
        std::stringstream s;
        s << "\"" << row[1] << "\"," << "\"" << lname << "\",\"" << tname << "\"" << std::endl;
        std::string log = s.str();
       
        std::stringstream of;
        of << argv[1] << "/" << row[1];
        std::string outfilename = of.str();
       
        myfile.open(outfilename.c_str(), std::ios::binary | std::ios::out );
        int size = strtol(row[2],NULL,0);
        myfile.write(row[0], size);
        myfile.close();
       
        printf("%s (%d)\n", row[1], size);
       
        logfile.write(log.c_str(), log.size());
       
    }
    logfile.close();
       
    // close connection
    mysql_free_result(res);
    mysql_close(conn);
   
    return 1;
}

// Compile on Ubuntu with : g++ -o blobbackup $(mysql_config --cflags) blobbackup.cpp $(mysql_config --libs)

As mentioned above, this should give the basic application that connects to the local MySQL database server and exports the BLOB's stored in an uploads table to the filesystem.

It's up to you to customize it from here to suit your own needs and I would urge you to make it more useful by passing the server, usernames, password and even the table and field names on the command line or from a config file.

I have already done so with my implementation and where you take it after that is up to you.

No comments:

Post a Comment