I’ve recently been working on a small project that involves Python and JavaScript code, and I keep tripping up on the differing syntax of their join() functions. (As well as semicolons, tabs, and braces, of course.) join() is a simple function that joins an array of strings into one long string, sticking a separator in between if you want.
So, join(["this","that","other"],"_") returns "this_that_other". Pretty simple.
Perl has join() as a built-in, and it has an old-school, non-object interface:

    my $foo_string = join(",", @bar_array);

Python is object-orienty, so it has an object interface:

    foo_string = ",".join(bar_array)

What’s interesting here is that join is a member of the string class, and you call it on the separator string. So you are asking a "," to join up the things in that array. OK, fine.
JavaScript does exactly the reverse. Here, join is a member of the array class:

    var foo_string = bar_array.join(",");

I think I slightly prefer JavaScript in this case, since calling member functions of the separator just “feels” weird.
I was surprised to see that C++ does not include join in its standard library, even though it has the underlying pieces: <vector> and <string>. I made up a little one like this:

    #include <string>

    template<class iterable_string_t>
    class joinable_thing_t : public iterable_string_t {
    public:
        // Joins every element with sep in between. The "end() - 1" test
        // needs random-access iterators, so this suits std::vector and friends.
        std::string join(const std::string &sep) const {
            std::string result;
            for (typename iterable_string_t::const_iterator it = this->begin();
                 it != this->end(); ++it) {
                result += *it;
                if (it != this->end() - 1)
                    result += sep;
            }
            return result;
        }
    };

You can see I took the JavaScript approach. By the way, Boost has a join of its own. It avoids the extra compare for the separator each time by handling the first list item separately.
Using it is about as easy as the scripting languages:

    joinable_thing_t<std::vector<std::string> > bar_array;
    ...
    std::string foo_string = bar_array.join(",");

I can live with that, though the copy on return is just a C++ism that will always bug me.
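As for that trick of handling the first item separately, here is a rough sketch of how it might look as a free function. To be clear, this is not Boost's actual code; join_first_separately is a name I made up for illustration, and I'm assuming the container holds std::string (or something that can be appended to one).

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not Boost's implementation: the first element is
// appended on its own, so the loop never has to ask "am I the last one?".
template <class Container>
std::string join_first_separately(const Container &items, const std::string &sep) {
    std::string result;
    typename Container::const_iterator it = items.begin();
    if (it == items.end())
        return result;              // empty input, empty output
    result += *it;                  // first item, no separator in front
    for (++it; it != items.end(); ++it) {
        result += sep;              // separator always precedes later items
        result += *it;
    }
    return result;
}
```

A side effect of this shape is that it only ever increments the iterator, so it should work with containers like std::list that have no end() - 1.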
Finally, I thought about what this might look like back in ye olden times, when we scraped our fingers on stone keyboards, and I came up with this:

    #include <string.h>

    size_t join(const char **strings, unsigned len, const char *sep, char *tgt = 0) {
        size_t total = 0;
        unsigned i = 0;
        size_t seplen = strlen(sep);
        const char *src;
        char *cptr = tgt;

        /* the first string goes in without a separator in front of it */
        if (len) {
            total = strlen(strings[i]);
            if (tgt) {
                src = strings[i];
                while (*src) *cptr++ = *src++;
            }
            i++;
        }
        /* every remaining string is preceded by the separator */
        while (i < len) {
            total += seplen + strlen(strings[i]);
            if (tgt) {
                src = sep;
                while (*src) *cptr++ = *src++;
                src = strings[i];
                while (*src) *cptr++ = *src++;
            }
            i++;
        }
        if (tgt) *cptr = 0;
        return total;
    }

Now that’s no beauty queen. The function does double duty to make it a bit easier to allocate for the resulting string. You call it first without a target pointer and it returns the size you need (not including the terminating null). Then you call it again with the target pointer for the actual copy.

    /* here's an array of strings */
    const char *bar_array[] = {
        "this",
        "that",
        "other",
    };
    /* allocating for the string */
    char *foo_string = (char *)malloc(join(bar_array, 3, ",") + 1);
    /* now, copy the string */
    join(bar_array, 3, ",", foo_string);

Of course, if any of the strings in that array are not terminated, or if you don’t pass in the right length, you’re going to get hurt.
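For comparison, the same measure-first, copy-second idea can be sketched with std::string doing the bookkeeping, which sidesteps both of those traps since every string carries its own length. The name join_reserved is mine, and this is just one way it could be written, assuming the input is a std::vector<std::string>.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: pass 1 measures, pass 2 copies.
std::string join_reserved(const std::vector<std::string> &items, const std::string &sep) {
    // Pass 1: add up the lengths, plus one separator per gap.
    std::size_t total = 0;
    for (std::size_t i = 0; i < items.size(); ++i)
        total += items[i].size();
    if (!items.empty())
        total += sep.size() * (items.size() - 1);

    // Pass 2: reserve once, then append without further growth.
    std::string result;
    result.reserve(total);
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i != 0)
            result += sep;
        result += items[i];
    }
    return result;
}
```

The caller no longer passes a length or manages a buffer, at the cost of walking the list twice, just like the stone-age version.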
Anyway, I must have been bored. I needed a temporary distraction.