How to convert a list of integers into a dynamic regex pattern that matches exactly those integers?

Body:
I'm trying to dynamically generate a compact regular expression in Ruby based on a list of integers.
For example, given this input list:
[1,2,3,4,5,6,7,9,10,11,12,13,14,15,17,18,19,20,21,22,23,25,26,27,28,29,30,31]
I would like to produce a regex like:
/^([1-7]|9|1[0-5]|1[7-9]|2[0-3]|2[5-9]|3[0-1])$/
This regex should match exactly the numbers in the list, in the most compact form possible. For example:
1-7 gets compacted into [1-7]
10-15 into 1[0-5]
17-19 into 1[7-9]
and so on.
What I have tried:
I wrote the following Ruby code:
def number_list_to_ranges(numbers)
  numbers.sort!
  ranges = []
  start = numbers.first
  prev = numbers.first
  numbers[1..].each do |n|
    if n == prev + 1
      prev = n
    else
      ranges << (start..prev)
      start = n
      prev = n
    end
  end
  ranges << (start..prev)
end

def range_to_regex(r)
  return r.begin.to_s if r.begin == r.end
  if r.begin >= 0 && r.end <= 9
    "[#{r.begin}-#{r.end}]"
  elsif r.begin >= 10 && r.end <= 99
    subranges = []
    (r.begin..r.end).each do |n|
      subranges << n
    end
    grouped = number_list_to_ranges(subranges)
    grouped.map do |subr|
      if subr.begin == subr.end
        subr.begin.to_s
      elsif subr.begin / 10 == subr.end / 10
        tens = subr.begin / 10
        units_start = subr.begin % 10
        units_end = subr.end % 10
        "#{tens}[#{units_start}-#{units_end}]"
      else
        subr.map(&:to_s).join('|')
      end
    end.join('|')
  else
    (r.begin..r.end).map(&:to_s).join('|')
  end
end

def generate_regex(numbers)
  ranges = number_list_to_ranges(numbers.uniq)
  parts = ranges.map { |r| range_to_regex(r) }
  "/^(" + parts.join('|') + ")$/"
end

nums = [1,2,3,4,5,6,7,9,10,11,12,13,14,15,17,18,19,20,21,22,23,25,26,27,28,29,30,31]
puts generate_regex(nums)
Problem:
This code does not correctly compact the list into the desired compact regex form. Instead, it just prints a verbose list like:
/^(1|2|3|4|5|6|7|9|10|11|12|13|14|15|17|18|19|20|21|22|23|25|26|27|28|29|30|31)$/
It doesn't group them into [1-7], 1[0-5], 1[7-9], etc.
Question:
How can I modify or improve this Ruby code to properly generate a compact regex from a list of integers?
Preferably:
Group continuous ranges into [start-end]
Handle tens nicely (e.g., 10-15 → 1[0-5])
Keep it readable and efficient
Any suggestions or better approaches?
Answer
Okay, I made this work with custom logic. The following code works, keeping these points in mind:
This approach works well for numbers up to 99.
For numbers >99, it would need extension (e.g., 100–199 logic).
You could further optimize to collapse cross-tens boundaries if needed (advanced).
# Expects a sorted array of unique non-negative integers.
# Returns one regex alternative per run of consecutive numbers
# that share the same tens digit.
def generate_regex_parts(x)
  optimized = []
  start = x[0]
  x.each_with_index do |value, n|
    following = x[n + 1] # nil once the last element is reached
    q = value / 10
    # Keep extending the current run while the next number is
    # consecutive and has the same tens digit.
    next if following && following - value == 1 && following / 10 == q
    if start == value
      # A run of a single number is emitted as a literal, e.g. "9".
      optimized << value.to_s
    else
      # Single-digit runs get no tens prefix: "[1-7]", not "0[1-7]".
      prefix = q.zero? ? "" : q.to_s
      optimized << "#{prefix}[#{start % 10}-#{value % 10}]"
    end
    start = following
  end
  optimized
end
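The caveats in this answer (numbers above 99, runs that cross a tens boundary) can be handled with a general range-splitting approach: cut each run of consecutive numbers at the points where a digit "rolls over" (…9 and …0 boundaries), then write each piece as one digit-by-digit pattern. The following is only a sketch under the assumption that the input contains non-negative integers; every method name in it (`fill_nines`, `fill_zeros`, `split_stops`, `digit_pattern`, `range_to_patterns`, `compact_regex`) is hypothetical and not part of the code above.

```ruby
# Sketch only: all method names here are hypothetical helpers.
# Assumes non-negative integers.

# n with its last k digits replaced by 9 (e.g. 123, 1 -> 129).
def fill_nines(n, k)
  n - n % 10**k + 10**k - 1
end

# n with its last k digits replaced by 0 (e.g. 124, 1 -> 120).
def fill_zeros(n, k)
  n - n % 10**k
end

# Upper bounds of the pieces that low..high is cut into, chosen so that
# each piece can be written as a single digit-by-digit pattern.
def split_stops(low, high)
  stops = [high]
  k = 1
  while (stop = fill_nines(low, k)).between?(low, high)
    stops << stop
    k += 1
  end
  k = 1
  while (stop = fill_zeros(high + 1, k) - 1) > low && stop <= high
    stops << stop
    k += 1
  end
  stops.uniq.sort
end

# Pattern for one piece; both bounds have the same number of digits.
def digit_pattern(low, high)
  low.to_s.chars.zip(high.to_s.chars).map do |a, b|
    a == b ? a : "[#{a}-#{b}]"
  end.join
end

# All patterns needed to cover low..high.
def range_to_patterns(low, high)
  start = low
  split_stops(low, high).map do |stop|
    pattern = digit_pattern(start, stop)
    start = stop + 1
    pattern
  end
end

# Builds an anchored Regexp that matches exactly the given numbers.
def compact_regex(numbers)
  runs = numbers.uniq.sort.chunk_while { |a, b| b == a + 1 }
  parts = runs.flat_map { |run| range_to_patterns(run.first, run.last) }
  Regexp.new("\\A(?:#{parts.join('|')})\\z")
end

nums = [1,2,3,4,5,6,7,9,10,11,12,13,14,15,17,18,19,20,21,22,23,25,26,27,28,29,30,31]
puts compact_regex(nums).source
# prints \A(?:[1-7]|9|1[0-5]|1[7-9]|2[0-3]|2[5-9]|3[0-1])\z

puts compact_regex((95..203).to_a).source
# prints \A(?:9[5-9]|1[0-9][0-9]|20[0-3])\z
```

A quick way to sanity-check any generated pattern is to compare what it matches against the input, for example `(0..99).select { |n| compact_regex(nums).match?(n.to_s) } == nums`. The sketch uses `\A` and `\z` instead of `^` and `$` because in Ruby the latter match at line boundaries rather than at the ends of the whole string.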