Recent Posts

NGit Tutorial

I recently decided to use the NGit library to interact with a Git repository as part of a ServiceStack project that I'm working on.

Why NGit?

That's a great question considering there's the awesome libgit2 library available. Unfortunately, the library doesn't support doing pull/fetch/merge according to this open issue.

NGit is a semi-automated port of the JGit library from Java over to .Net and it's maintained by the Mono team. And although it's kind of frustrating to use, it supports all of Git's feature set and generally works just fine once you figure it all out.

Since I had a hard time finding examples and documentation for NGit, I'm posting some code snippets and explanations for common features I needed to use in the hopes that it helps some other developer out there in the future.

NGit Documentation: Unit Tests

NGit follows the common "unit tests as documentation" pattern, so I encourage you to clone a copy of the NGit repository and look through its unit tests whenever you need to see the usage pattern of a particular command or feature.

Git Commands

NGit uses a command-based API that is built off of a Git class which is returned when you initialize, open, or clone a repository. Commands generally return themselves, so you end up chaining commands similar to working with jQuery.

Cloning a Repository

To clone a repository, you need to create a CloneCommand and set at least two things: the local directory to clone into and the URI of the origin that you're cloning from:

// Let's clone the NGit repository
var clone = Git.CloneRepository()
    .SetDirectory(@"C:\Git\NGit")
    .SetURI("https://github.com/mono/ngit.git");

// Execute and return the repository object we'll use for further commands
var repository = clone.Call();

Specifying Credentials

For simple HTTP/HTTPS credentials, you will generally create a UsernamePasswordCredentialsProvider object and either set it on the command you're calling or set it as the default up front before executing any command:

var credentials = new UsernamePasswordCredentialsProvider("username", "password");

// On a per-command basis
var fetch = repository.Fetch()
    .SetCredentialsProvider(credentials)
    .Call();

// Or globally as the default for each new command
CredentialsProvider.SetDefault(credentials);

If you need to use SSH with private key authentication, things get a little more complicated. I will cover the solution to that at the bottom of my post.

Opening an Existing Repository

Opening an existing repository is simple:

var repository = Git.Open(@"C:\Git\NGit");

Fetch, Pull, Status, Clean, Add, Remove

Most commands are fairly simple:

// Fetch changes without merging them
var fetch = repository.Fetch().Call();

// Pull changes (will automatically merge/commit them)
var pull = repository.Pull().Call();

// Get the current branch status
var status = repository.Status().Call();

// The IsClean() method is helpful to check if any changes
// have been detected in the working copy. I recommend using it,
// as NGit will happily make a commit with no actual file changes.
bool isClean = status.IsClean();

// You can also access other collections related to the status
var added = status.GetAdded();
var changed = status.GetChanged();
var removed = status.GetRemoved();

// Clean our working copy
var clean = repository.Clean().Call();

// Add all files to the stage (you could also be more specific)
var add = repository.Add()
    .AddFilePattern(".")
    .Call();

// Remove files from the stage (this also deletes them from the working copy)
var remove = repository.Rm()
    .AddFilePattern(".gitignore")
    .Call();

Reset

If we fetched changes from origin/master and want to reset our current branch to match:

var reset = repository.Reset()
    .SetMode(ResetCommand.ResetType.HARD)
    .SetRef("origin/master")
    .Call();

Commit, Push

To commit and push a change, you would do the following:

var author = new PersonIdent("Lance Mcnearney", "lance@mcnearney.net");
var message = "My commit message";

// Commit our changes after adding files to the stage
var commit = repository.Commit()
    .SetMessage(message)
    .SetAuthor(author)
    .SetAll(true) // This automatically stages modified and deleted files
    .Call();

// Our new commit's hash
var hash = commit.Id;

// Push our changes back to the origin
var push = repository.Push().Call();

Private Key Authentication Using SSH

Credit for figuring out how to wire up SSH authentication using a private key goes to Doug and his Stack Overflow question:

My implementation does not require the public key, but as a trade-off you must make sure to specify the username in the ssh:// URI. I also wipe out the GIT_SSH environment variable, as Jsch will use that instead of the configured JschConfigSessionFactory when initializing its SSH connection. In my case, it was calling TortoisePlink.exe.

You must create a custom JschConfigSessionFactory class:

/// <summary>
/// Handles setting up the public key authentication when using a remote SSH repository
/// </summary>
public class PrivateKeyConfigSessionFactory : JschConfigSessionFactory
{
    private string PrivateKeyPath { get; set; }

    public PrivateKeyConfigSessionFactory(string privateKeyPath)
    {
        PrivateKeyPath = privateKeyPath;

        // Clear the GIT_SSH environment variable as NGit will use it for SSH transport instead of the session factory
        Environment.SetEnvironmentVariable("GIT_SSH", string.Empty, EnvironmentVariableTarget.Process);
    }

    protected override void Configure(OpenSshConfig.Host hc, Session session)
    {
        var config = new Properties();
        // Accept unknown host keys without prompting (only do this for hosts you trust)
        config["StrictHostKeyChecking"] = "no";
        config["PreferredAuthentications"] = "publickey";
        session.SetConfig(config);

        var jsch = GetJSch(hc, FS.DETECTED);
        jsch.AddIdentity("KeyPair", File.ReadAllBytes(PrivateKeyPath), null, null);
    }
}

Once you have your custom class, you can then configure NGit/Jsch to use it:

// Use our custom SSH session when accessing remote SSH:// repositories
// The username must be in the repository Uri: ssh://git@host/var/git/repo.git
var privateKeyPath = @"C:\Git\private.key";
var factory = new PrivateKeyConfigSessionFactory(privateKeyPath);
SshSessionFactory.SetInstance(factory);

Cleaning up after NGit

Since NGit is a port from Java, it doesn't implement IDisposable when accessing files. To remove its lock on files, you can dispose of the Git object by doing the following:

// Handle disposing of NGit's locks
repository.GetRepository().Close();
repository.GetRepository().ObjectDatabase.Close();
repository = null;

If you need to delete the repository later, you may also want to recursively remove any read-only file attributes set by NGit in the repository's path. Otherwise, you will receive permission exceptions when attempting to delete it.

var files = Directory.GetFiles(@"C:\Git\NGit", "*", SearchOption.AllDirectories);

// Remove the read-only attribute applied by NGit to some of its files
foreach (var file in files)
{
    // Directory.GetFiles returns path strings, so set the attributes through File
    File.SetAttributes(file, FileAttributes.Normal);
}

Categories: .Net, Git, NGit

.Net SFTP Libraries

We're integrating SFTP capabilities into a new project at work. Since there is no built-in support for SFTP in the .Net framework, I researched available libraries (both free and commercial) and here are the results.

Rebex SFTP - Commercial, $300/developer

The Rebex component ended up being the best library available for our needs. The API was clean, software stable, and it came with a number of well-written sample applications. Their forum site appears to run on Stack Exchange (either a private license or clone), which is pretty handy. They also had a simple licensing model (per developer) and the only difference between the trial and licensed library is the binary itself. You don't have to install the program to the GAC, activate it, embed a license key in your application config, etc.

Other Commercial Libraries (SecureBlackbox, IPWorks SSH, wodSFTP.Net)

I'm going to skip the other commercial libraries as they failed for one (or more) of the following reasons:

  • Horrible API (generally the result of offering the same API across Java, C++, etc.)
  • Lack of documentation
  • Too expensive
  • Complex licensing schemes

SharpSSH - Open Source, Free

SharpSSH was the first open source library I found since it was mentioned multiple times on Stack Overflow and other sites. Unfortunately, the project looks abandoned and is hosted on SourceForge (the Geocities of open source projects). I wasn't able to download a copy (SF was having issues) and ended up avoiding the library since there were enough negative comments about using it (not maintained, not fully implemented, straight Java porting of code, etc).

SSH.Net - Open Source, Free

SSH.Net seems to be the modern-day replacement for SharpSSH. The project is marked as "stable" on their CodePlex site, but I found it unstable even with my basic LinqPad testing (uploads would randomly error out on a solid connection). It is actively developed, and I believe that with some time to mature it will probably become a popular library. It also lacks any kind of documentation, wanting you to read unit tests to discover the API (this seems to be a common theme with some open source projects).

WinSCP .Net Wrapper - Open Source, Free

I was really excited when I came across the .Net wrapper for WinSCP. WinSCP has been around a long time and is one of the most popular clients for SCP/SFTP file transfers. Unfortunately, the wrapper's documentation specifically states that it's not meant for interactive use (like in a GUI program), which is exactly what we were building. If our requirements had been different (DevOps work), it probably would have worked great.

Categories: .Net, SFTP

Server Migration

Trading Linode for Digital Ocean

My website is now hosted on a Digital Ocean VPS. I made the move after reading this thread on Hacker News:

I'm satisfied with the service so far. I was paying $20/month for the smallest Linode but am only paying $5/month for a similar VPS on Digital Ocean. Their website and tools are pretty basic and they don't offer nearly as many features as Linode but I believe they'll get there in time (judging from their blog and community forums).

I never had a problem with Linode and would not hesitate to recommend them but I couldn't pass up a 75% price savings. :)

Django Updates: Paying Off Technical Debt

My site was originally built using Django 1.2 back in 2009. Since then, they've released major updates every year and I neglected to upgrade because the site was stable and running along fine. Since I was moving to a new server, it felt like the right time to pay off some technical debt by upgrading.

Some notes:

  • When developing using Django, you should enable deprecation warnings so you have a heads up on upcoming features that will be removed. By default they're disabled on Python 2.7.
  • Django's release notes tend to bury deprecation warnings in a huge wall of text. Luckily, I was able to just search on each error that popped up while migrating and could find threads with easy fixes.
  • Third party modules caused all sorts of headaches. For example, the sorl-thumbnail module mysteriously stopped generating thumbnail images to disk and wouldn't output any type of error. I found other developers with the same mysterious problem and ended up switching to easy-thumbnails. That meant modifying all of my templates to use a different template tag syntax, etc.
  • You should really keep some documentation for all of the dependencies of your site. I had neglected that originally and ended up having to hunt for apt packages, pip modules, and even ruby gems.
  • I've put off upgrading to the recently released Django 1.5 because they removed function-based generic views and the documentation isn't very helpful for migrating them to the class-based versions. I'll figure it out eventually...
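As a rough illustration of the first note, here is a minimal sketch of turning deprecation warnings on from code. The old_api function and the call_with_deprecations_visible helper are made up for this example and are not part of Django; they only stand in for a framework call that is scheduled for removal.

```python
import warnings


def old_api():
    """Hypothetical stand-in for a framework function scheduled for removal."""
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return "result"


def call_with_deprecations_visible(func):
    """Call func with DeprecationWarning enabled; return (result, messages)."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" reports every occurrence. Without a filter like this,
        # Python 2.7 silently ignores DeprecationWarning.
        warnings.simplefilter("always", DeprecationWarning)
        result = func()
    messages = [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]
    return result, messages


if __name__ == "__main__":
    print(call_with_deprecations_visible(old_api))
```

Outside of code, the interpreter's -W flag should give the same effect for a whole run, for example python -Wd manage.py runserver.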

I have pushed my updated website source code to GitHub at the following URL:

PyCharm 2.7 & VirtualBox

JetBrains released PyCharm 2.7 and I must say it's pretty awesome. I may be a little biased since I spend most of my day using Visual Studio with ReSharper and PhpStorm at work, but they've made some huge improvements over the initial releases of the IDE.

You can set up your project to use a remote Python interpreter. That means you can set up a VirtualBox Linux host manually (or with something like Vagrant) and then have PyCharm SSH into the remote host and automatically detect the version of Python and all of its modules (Django, etc.). I also set up a shared folder with my host so that as I edited files they were automatically updated on the server, and Django's built-in web server would detect the changes and refresh itself. This helped a lot because I could develop on Windows while the site was running on a Linux server that matched my eventual production server.

Categories: Django, Linux, Nginx, PyCharm, Python, VPS, VirtualBox

PHP Memcached RES_PROTOCOL_ERROR?

In case anyone else runs into this issue - if you're using the Memcached module for PHP and start mysteriously receiving the following result code:

RES_PROTOCOL_ERROR - 8

You may want to check that the key you're using doesn't contain any spaces in it, like the following (which was my verbose way of creating a cache key for a service call's results):

Categories->GetProducts(12345, 25, true);

If it does, it seems that they break the ASCII protocol when setting or getting the key (confirmed by turning "very verbose logging" on for the memcached server). I would have assumed that the value was hashed on the client before being sent. It's an easy fix though - just hash the key with something like md5.

The restriction is documented in the Memcached project's documentation (although it's kind of obscure):

Avoid User Input

It's very easy to compromise memcached if you use arbitrary user input for keys. The ASCII protocol uses spaces and newlines. Ensure that neither show up your keys, live long and prosper. Binary protocol does not have this issue.
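The md5 fix mentioned earlier can be sketched in a few lines. This is shown in Python rather than PHP purely as an illustration; the memcached_safe_key name and the "svc:" prefix are my own inventions, not part of any Memcached client.

```python
import hashlib


def memcached_safe_key(raw_key, prefix="svc:"):
    """Return a key with no spaces or newlines by hashing the readable key.

    The prefix is a hypothetical namespace so hashed keys are still
    recognizable when inspecting the cache.
    """
    digest = hashlib.md5(raw_key.encode("utf-8")).hexdigest()
    return prefix + digest


if __name__ == "__main__":
    # The verbose key from above contains spaces, the hashed one does not
    print(memcached_safe_key("Categories->GetProducts(12345, 25, true);"))
```

The hashed key is always 32 hex characters plus the prefix, which also keeps it well under memcached's 250 byte key limit. The trade-off is that the keys are no longer human-readable when debugging.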

Categories: Linux, Memcached, PHP

Compiling Nginx With Cache Purging Support

I've been experimenting with different caching methods ever since reading this post on the performance differences between no caching, built-in Django caching, memcached, static files, and Varnish. Varnish and static files were the fastest by a large margin.

Since I'd like to avoid adding another proxy layer to my setup (already using both Nginx and Apache), I looked into the caching capabilities built into Nginx. It seems that originally, if you wanted to use caching with Nginx, you used ncache. This is no longer the case since that project has been built into the Nginx core.

To add support for purging a cached item, you need to install the ngx_cache_purge module. Unfortunately, installing an Nginx module isn't as easy as installing an Apache module. It requires a re-compile and when you're used to just installing your software through apt-get, that can be a little intimidating. I also didn't want to lose the special setup that the package installer provided including using the /sites-available/ and /sites-enabled/ folders, the init script, etc.

Here's a quick guide on re-compiling while still remaining compatible with the default package:

Install Nginx through aptitude

This is really easy (and you probably already did this):

apt-get install nginx

Install compile tools

A few of the compile libraries needed by Nginx aren't installed by default. To install them, use the following command:

aptitude -y install build-essential libc6 libpcre3 libpcre3-dev libpcrecpp0 libssl0.9.8 libssl-dev zlib1g zlib1g-dev lsb-base

Download and extract the source

Download and extract the source of both the newest stable Nginx and the cache_purge module:

cd /usr/src/
wget http://www.nginx.org/download/nginx-0.7.65.tar.gz
wget http://labs.frickle.com/files/ngx_cache_purge-1.0.tar.gz
tar -xvf nginx-0.7.65.tar.gz
tar -xvf ngx_cache_purge-1.0.tar.gz
cd nginx-0.7.65/

Configure compile options

You can view the packaged Nginx's compile options by using:

nginx -V

For my server, I kept generally the same options but added the cache_purge module to the end using --add-module:

./configure --sbin-path=/usr/sbin --conf-path=/etc/nginx/nginx.conf \
--error-log-path=/var/log/nginx/error.log --pid-path=/var/run/nginx.pid \
--lock-path=/var/lock/nginx.lock --http-log-path=/var/log/nginx/access.log \
--http-client-body-temp-path=/var/lib/nginx/body \
--http-proxy-temp-path=/var/lib/nginx/proxy \
--http-fastcgi-temp-path=/var/lib/nginx/fastcgi --with-debug \
--with-http_stub_status_module --with-http_flv_module --with-http_ssl_module \
--with-http_dav_module --with-ipv6 --add-module=/usr/src/ngx_cache_purge-1.0

This command will write out a summary of the configured options.

Compile Nginx

Compiling Nginx can be done with this command:

make && make install

A lot of text will scroll by during the compilation. In my case, I hadn't stopped my Nginx service before compiling and an error occurred when the install process attempted to copy the new Nginx executable over the current one. To fix this, I had to manually stop, copy and restart Nginx:

/etc/init.d/nginx stop
cp objs/nginx /usr/sbin/nginx
/etc/init.d/nginx start

If everything worked, you can view the current version and compile information of the running Nginx by using:

nginx -V

And the final result should be:

nginx version: nginx/0.7.65
built by gcc 4.4.1 (Ubuntu 4.4.1-4ubuntu8)
TLS SNI support enabled
configure arguments: --sbin-path=/usr/sbin --conf-path=/etc/nginx/nginx.conf 
--error-log-path=/var/log/nginx/error.log --pid-path=/var/run/nginx.pid 
--lock-path=/var/lock/nginx.lock --http-log-path=/var/log/nginx/access.log 
--http-client-body-temp-path=/var/lib/nginx/body 
--http-proxy-temp-path=/var/lib/nginx/proxy 
--http-fastcgi-temp-path=/var/lib/nginx/fastcgi --with-debug 
--with-http_stub_status_module --with-http_flv_module 
--with-http_ssl_module --with-http_dav_module --with-ipv6 
--add-module=/usr/src/ngx_cache_purge-1.0
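If you would rather check for the module from a script than read the output by eye, something like the following sketch could work. The function names and the shortened SAMPLE text are assumptions for illustration only; the sketch just searches the "configure arguments:" text for an --add-module flag.

```python
import shlex

# Shortened, hypothetical sample of `nginx -V` output for demonstration
SAMPLE = """nginx version: nginx/0.7.65
configure arguments: --sbin-path=/usr/sbin --with-ipv6
--add-module=/usr/src/ngx_cache_purge-1.0
"""


def configure_arguments(version_output):
    """Extract the configure flags from `nginx -V` text as a list of strings."""
    _, found, tail = version_output.partition("configure arguments:")
    if not found:
        return []
    # The flags may be wrapped across lines; shlex.split treats
    # newlines as ordinary whitespace.
    return shlex.split(tail)


def has_added_module(version_output, module_name):
    """True if any --add-module=... flag contains module_name in its path."""
    return any(
        arg.startswith("--add-module=") and module_name in arg
        for arg in configure_arguments(version_output)
    )


if __name__ == "__main__":
    print(has_added_module(SAMPLE, "ngx_cache_purge"))
```

Note that nginx -V writes to stderr rather than stdout, so a script capturing the real output would need to read stderr (for example with subprocess and stderr=subprocess.PIPE).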

Categories: Linux, Nginx