I've been absent from my project blog for the past week because I've been working on a submission to OOPSLA 2012. Now that the paper has been submitted, I can get back to working on this project.
As of today, I've completed GPU-accelerated versions of the following algorithms: sort, find, min_element, max_element, set_union, set_intersection, set_difference, and set_symmetric_difference. I still plan to implement for_each, count, and equal. Once those are finished, I'll continue to explore runtime heuristics, analyze the performance of each algorithm, and attempt to optimize the GPU performance.
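As an illustration of what a runtime heuristic might look like, here is a minimal sketch of a size-based dispatch that picks between a CPU and a GPU path. It is not the project's actual code: the names `Backend`, `kGpuThreshold`, `choose_backend`, `gpu_sort_placeholder`, and `dispatch_sort` are all hypothetical, the threshold value is an arbitrary placeholder rather than a measured number, and the "GPU" path simply calls `std::sort` so the example stays runnable without a GPU.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical backend tag; illustrative only.
enum class Backend { Cpu, Gpu };

// Hypothetical cutoff: below this element count, host-to-device transfer
// overhead is assumed to outweigh any GPU speedup. A real value would
// have to come from measurements on the target machine.
constexpr std::size_t kGpuThreshold = std::size_t{1} << 16;

// Decide where to run based only on the input size.
inline Backend choose_backend(std::size_t n) {
    return n >= kGpuThreshold ? Backend::Gpu : Backend::Cpu;
}

// Stand-in for a GPU sort. It falls back to std::sort so that the
// sketch compiles and runs on any machine.
template <typename It>
void gpu_sort_placeholder(It first, It last) {
    std::sort(first, last);
}

// Sort the range on the chosen backend and report which one was used.
template <typename It>
Backend dispatch_sort(It first, It last) {
    const auto n = static_cast<std::size_t>(std::distance(first, last));
    const Backend backend = choose_backend(n);
    if (backend == Backend::Gpu) {
        gpu_sort_placeholder(first, last);
    } else {
        std::sort(first, last);
    }
    return backend;
}
```

A single size cutoff is the simplest possible heuristic. A more realistic one would probably also account for element type, whether the data already lives on the device, and the specific GPU in use.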
I've also gained ssh access to a server with a better GPU than the one I've been using for testing, so I'll be able to collect performance results and compare at least two GPUs. I may also try to test a few more GPUs in the graphics lab, but that depends on whether the STL code I've developed works on those machines, since the current implementation is tied to a specific gcc version.