String swap algorithms

When I went for interview at Imperial College, my interviewer (Dr. P. Kelly) asked me quite a nice question, which (for some unearthly reason) I decided to follow up on afterwards.

The question was a simple one: given a string (e.g.) "examplesgnome" consisting of two substrings (in this case, "examples" and "gnome"), how can we swap the two substrings in-place? In the interview, we covered the third method in more detail, but touched upon the other two. Afterwards, I calculated their computational efficiency, and determined while they all have the same upper bound, one has a lower average running time.

Method one: reversal

This is one Dr. Kelly suggested, which is really rather neat. It consists of two steps: reversing the entire string, then reversing both substrings. For example:

  1. examplesgnome
  2. emongselpmaxe
  3. gnomeexamples

If we take

n

to equal the total number of characters,

a

to equal the number of characters in the smaller substring, and

b

to equal the number of characters in the larger substring, this has:

\left\lfloor \frac{n}{2} \right\rfloor + \left\lfloor \frac{a}{2} \right\rfloor + \left\lfloor \frac{b}{2} \right\rfloor

swaps, so we may as well say it's of linear complexity

O(n)

.

Method two: cycles

This one was also suggested by Dr. Kelly, and involves taking the first character of the larger substring out to a temporary location, then cyclically moving characters

b

places along into the generated space. Once the strings have been swapped (

n

swaps have taken place), we put the stored character back in the final space. For example:

  1. examplesgnome
  2. _xamplesgnome
  3. gxamples_nome
  4. gxa_plesmnome
  5. gxamplesmno_e
  6. gxampl_smnoee
  7. g_amplxsmnoee
  8. gnamplxsm_oee
  9. gnam_lxsmpoee
  10. gnamelxsmpoe_
  11. gnamelx_mpoes
  12. gn_melxampoes
  13. gnomelxamp_es
  14. gnome_xamples
  15. gnomeexamples

Since each character moves to its final position in one move, the algorithm is intuitively of complexity

O(n)

— but we can go further and say it's of absolute complexity

\sim (n + 2)

, taking the steps to remove the first character and fill the last space into account.

Method three: reduction

This is the method I came up with during the interview and, with some help, made work. It basically involves swapping the first smaller substring into its final position, then applying the same algorithm to the resulting subsubstrings of the second substring. For example:

  1. examplesgnome
  2. gnomelesexamp
  3. gnomeampexles
  4. gnomeexpamles
  5. gnomeexmaples
  6. gnomeexamples

This is quite a neat one to implement, and will perform

O(n)

swaps in the worst case (when the smaller substring is only one character long, as one character moves to its final position with each swap).

However, what if we consider the case "foobar", where the substrings are "foo" and "bar"? In this case, method three only takes 3 swaps. Obviously, this method is limited in the upper bound to

O(n)

swaps, but it seems it can do the operation in far fewer — as few as

\Omega (a)

.

I haven't got a concrete proof for this, but I believe this is actually of complexity:

\sim (a \left\lfloor \frac{b}{a} \right\rfloor + (b \bmod{a}) \left\lfloor \frac{a}{b \bmod{a}} \right\rfloor)

as the lengths of the substring for each iteration

i

depend on the remainder of

\frac{b}{a}

for the

i - 1

th iteration. This complexity can be manipulated to prove that it's also of complexity

O(n)

:

All-in-all, I'm quite pleased with the interview: even if I'm not successful, it's given me an interesting problem to consider. Corrections and further thoughts on the problem are welcomed.

5 thoughts on “String swap algorithms

  1. Patrys

    I seriously don't think you can consider swapping two strings of equal length an atomic operation.

    If you do, then the single letter swapping method does not need a placeholder step, you just swap the i-th character with any of the remaining [i+1..n] characters if needed. This makes it ~(n - 1) at most (i-th step guarantees i characters in place but there's nothing to swap the n-th character with).

  2. Patrys

    Also, I think it's really a problem of assigning proper weights to each letter and choosing the most efficient in-place sorting algorithm (be it letter- or block-based).

  3. Marius Gedminas

    Jon Bentley's _Programming Pearls_ also discusses this problem (called 'vector rotation' in the book), with graphs of actual running times of the three algorithms.

  4. Benjamin Otte

    You can improve on Method 2 by not keeping an extra character, but using the empty space as the "extra character". If you do this, the approach is of equal complexity, but simpler to implement:
    given 3 variables:
    length = length (str)
    lcd = LCD(length (substr1), length);
    for (i =0; i < lcd; i++):
    .. j = i
    .. k = (j + l1) mod length
    .. while k != i:
    .... swap_chars (j, k)
    .... j = k
    .... k = (j + l1) mod length

    That should give you at least equal performance than method 3 without doing anything fancy. It should also take the minimum amount of swaps necessary. Examples:
    foobar
    boofar - 1 swap
    baofor - 1 swap
    barfoo - 1 swap
    123456foo
    f231564oo - 2 swaps
    fo312645o - 2 swaps
    foo123456 - 2 swaps

  5. Jonathan Turner

    Thanks for the puzzler. Always a good exercise to bang a problem like that around: it's easy to wrap your head around but there's still enough subtlety there to make it interesting.

    After some looking, my cousin pointed out that stl::rotate uses your #2 solution. After a minute of thought, it's pretty obvious why.

    Solution #1 and #3 use swaps, and regardless of how efficient a swap is, I'm pretty sure you always read three times (once for each slot, and once for the temp) and write three times (same). Solution #2 does not swap, but instead reads once and writes once per transfer. Even at O(n) (or n+2) ops, the number of operations is less than O(n/2) swaps.

Comments are closed.