2012-10-21 06:03:44 8 Comments
What is the best way to represent and solve a maze given an image?
Given a JPEG image (as seen above), what's the best way to read it in, parse it into some data structure and solve the maze? My first instinct is to read the image in pixel by pixel and store it in a list (array) of boolean values: True for a white pixel, and False for a non-white pixel (the colours can be discarded). The issue with this method is that the image may not be "pixel perfect". By that I simply mean that if there is a white pixel somewhere on a wall it may create an unintended path.
Another method (which came to me after a bit of thought) is to convert the image to an SVG file, which is a list of paths drawn on a canvas. This way, the paths could be read into the same sort of list (boolean values), where True indicates a path or wall and False indicates a travel-able space. An issue with this method arises if the conversion is not 100% accurate and does not fully connect all of the walls, creating gaps.
Another issue with converting to SVG is that the lines are not "perfectly" straight. This results in the paths being cubic Bézier curves. With a list (array) of boolean values indexed by integers, the curves would not transfer easily: all the points that lie on the curve would have to be calculated, but they won't exactly match list indices.
I assume that while one of these methods may work (though probably not) that they are woefully inefficient given such a large image, and that there exists a better way. How is this best (most efficiently and/or with the least complexity) done? Is there even a best way?
Then comes the solving of the maze. If I use either of the first two methods, I will essentially end up with a matrix. According to this answer, a good way to represent a maze is using a tree, and a good way to solve it is using the A* algorithm. How would one create a tree from the image? Any ideas?
TL;DR
Best way to parse? Into what data structure? How would said structure help/hinder solving?
UPDATE
I've tried my hand at implementing what @Mikhail has written in Python, using numpy, as @Thomas recommended. I feel that the algorithm is correct, but it's not working as hoped. (Code below.) The PNG library is PyPNG.
    import png, numpy, Queue, operator, itertools

    def is_white(coord, image):
        """ Returns whether (x, y) is approx. a white pixel."""
        a = True
        for i in xrange(3):
            if not a: break
            a = image[coord[1]][coord[0] * 3 + i] > 240
        return a

    def bfs(s, e, i, visited):
        """ Perform a breadth-first search. """
        frontier = Queue.Queue()
        while s != e:
            for d in [(-1, 0), (0, -1), (1, 0), (0, 1)]:
                np = tuple(map(operator.add, s, d))
                if is_white(np, i) and np not in visited:
                    frontier.put(np)
            visited.append(s)
            s = frontier.get()
        return visited

    def main():
        r = png.Reader(filename = "thescope-134.png")
        rows, cols, pixels, meta = r.asDirect()
        assert meta['planes'] == 3 # ensure the file is RGB
        image2d = numpy.vstack(itertools.imap(numpy.uint8, pixels))
        start, end = (402, 985), (398, 27)
        print bfs(start, end, image2d, [])
9 comments
@Brian D 2018-10-04 00:55:31
Here's a solution using R.
RGB to greyscale, see: https://stackoverflow.com/a/27491947/2371031
Voila!
This is what happens if you don't fill in some border pixels (Ha!)...
Full disclosure: I asked and answered a very similar question myself before I found this one. Then through the magic of SO, found this one as one of the top "Related Questions". I thought I'd use this maze as an additional test case... I was very pleased to find that my answer there also works for this application with very little modification.
@moooeeeep 2013-05-20 19:33:31
I tried myself implementing A-Star search for this problem. Followed closely the implementation by Joseph Kern for the framework and the algorithm pseudocode given here:
As A-Star is a heuristic search algorithm, you need to come up with a function that estimates the remaining cost (here: distance) until the goal is reached. Unless you're comfortable with a suboptimal solution, it should not overestimate the cost. A conservative choice here would be the Manhattan (or taxicab) distance, as this represents the straight-line distance between two points on the grid for the von Neumann neighborhood used here. (Which, in this case, wouldn't ever overestimate the cost.)
This would, however, significantly underestimate the actual cost for the maze at hand. Therefore I've added two other distance metrics for comparison: squared Euclidean distance and the Manhattan distance multiplied by four. These, however, might overestimate the actual cost, and might therefore yield suboptimal results.
Here's the code:
Here are some images for a visualization of the results (inspired by the one posted by Joseph Kern). The animations show a new frame each after 10000 iterations of the main while-loop.
Breadth-First Search:
A-Star Manhattan Distance:
A-Star Squared Euclidean Distance:
A-Star Manhattan Distance multiplied by four:
The results show that the explored regions of the maze differ considerably depending on the heuristic used. As a result, squared Euclidean distance even produces a different (suboptimal) path than the other metrics.
Concerning the performance of the A-Star algorithm in terms of the runtime until termination, note that the many evaluations of the distance and cost functions add up compared to the Breadth-First Search (BFS), which only needs to evaluate the "goaliness" of each candidate position. Whether or not the cost of these additional function evaluations (A-Star) outweighs the cost of the larger number of nodes to check (BFS), and especially whether or not performance is an issue for your application at all, is a matter of individual perception and of course cannot be answered in general.
A thing that can be said in general about whether or not an informed search algorithm (such as A-Star) could be the better choice compared to an exhaustive search (e.g., BFS) is the following. With the number of dimensions of the maze, i.e., the branching factor of the search tree, the disadvantage of an exhaustive search (to search exhaustively) grows exponentially. With growing complexity it becomes less and less feasible to do so and at some point you are pretty much happy with any result path, be it (approximately) optimal or not.
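The effect of swapping heuristics can be tried on a toy grid. The following is a minimal sketch, not the code from this answer: `a_star`, `manhattan` and the three-row maze are hypothetical names and data chosen for illustration, and the grid is assumed to be a list of rows of booleans (True = open).

```python
import heapq


def manhattan(a, b):
    """Taxicab distance between two (row, col) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def a_star(grid, start, goal, heuristic=manhattan):
    """Return (path_length_in_steps, cells_expanded), or None if goal is unreachable."""
    rows, cols = len(grid), len(grid[0])
    best = {start: 0}                       # cheapest known cost to each cell
    heap = [(heuristic(start, goal), 0, start)]
    expanded = 0
    while heap:
        _, cost, cell = heapq.heappop(heap)
        if cost > best[cell]:
            continue                        # stale heap entry, a cheaper one was processed
        if cell == goal:
            return cost, expanded
        expanded += 1
        r, c = cell
        for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc]:
                new_cost = cost + 1
                if new_cost < best.get(nxt, float("inf")):
                    best[nxt] = new_cost
                    heapq.heappush(heap, (new_cost + heuristic(nxt, goal), new_cost, nxt))
    return None


maze = ["S   ",
        "### ",
        "E   "]
grid = [[ch != "#" for ch in row] for row in maze]
start, goal = (0, 0), (2, 0)

zero = a_star(grid, start, goal, heuristic=lambda a, b: 0)      # no guidance at all
plain = a_star(grid, start, goal)                                # admissible heuristic
weighted = a_star(grid, start, goal,
                  heuristic=lambda a, b: 4 * manhattan(a, b))    # may overestimate
```

With the zero heuristic the search has no guidance and explores in order of distance from the start, much like BFS. With the plain Manhattan distance the returned length is still the shortest one. With the inflated heuristic that guarantee is gone, even though on a single-corridor toy maze like this one there is only one route to find.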
@example 2014-05-15 16:23:28
"A-Star Manhattan Distance multiplied by four"? A-Star is not A-Star if the heuristic can overestimate the distance. (And thus does not guarantee to find a shortest path either)
@moooeeeep 2014-05-22 07:43:26
@example Of course, if one applies a non-admissible heuristic function the algorithm might fail to find the optimal solution (like I pointed out in my answer). But I wouldn't go so far as to rename the basic algorithm for that reason.
@Mikhail 2012-10-21 07:20:56
Here is a solution.
Here is the MATLAB code for BFS:
It is really simple and standard; there should be no difficulty implementing this in Python or any other language.
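For readers without MATLAB, a Python version might look roughly like the sketch below. It is not a translation of the code above: `bfs_path` and the small text maze are hypothetical, and the image is assumed to have been reduced to rows of booleans (True = open pixel) beforehand.

```python
from collections import deque


def bfs_path(grid, start, goal):
    """Shortest 4-connected path from start to goal as a list of (row, col), or None."""
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}          # doubles as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] and (nr, nc) not in parent):
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


maze = ["#########",
        "#S  #   #",
        "# # # # #",
        "# #   #E#",
        "#########"]
grid = [[ch != "#" for ch in row] for row in maze]
path = bfs_path(grid, (1, 1), (3, 7))
```

Storing a parent for each pixel when it is first queued is what lets the path be recovered at the end; a plain visited list only tells you which pixels were explored.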
And here is the answer:
@Mikhail 2012-10-21 20:33:28
@Whymarrh Well, for "Just this image" you now actually have an answer. Do you have any specific questions? Items 1-4 from my list are the manual processing I was asking about. Item 5 is a BFS - the very basic algorithm for graphs, but it can be applied to image directly, without converting pixels to vertices and neighbors to edges.
@Whymarrh 2012-10-21 21:13:52
I feel that you've covered everything. I'm trying my hand at implementing what you've said in Python (using DFS in place of BFS, only because I've coded that once before). I'll be back to update the question/accept answers in a bit.
@Mikhail 2012-10-22 09:06:20
@Whymarrh DFS will not find you the shortest way, while BFS will. They are inherently the same, the only difference is the underlying structure. Stack (FILO) for DFS and queue (FIFO) for BFS.
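To illustrate the point about the underlying structure, here is a small sketch in which one flag switches between the two. The function `search` and the 5x5 room are made up for this example; the room stands in for a corridor that is several pixels wide.

```python
from collections import deque


def search(grid, start, goal, use_stack):
    """Return the number of steps on the path found, or None.

    use_stack=False takes cells from the front (FIFO queue, BFS);
    use_stack=True takes cells from the back (stack, DFS-style).
    """
    rows, cols = len(grid), len(grid[0])
    steps = {start: 0}              # visited cells and the path length to each
    frontier = deque([start])
    while frontier:
        r, c = frontier.pop() if use_stack else frontier.popleft()
        if (r, c) == goal:
            return steps[(r, c)]
        for nr, nc in ((r - 1, c), (r, c - 1), (r + 1, c), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] and (nr, nc) not in steps):
                steps[(nr, nc)] = steps[(r, c)] + 1
                frontier.append((nr, nc))
    return None


room = [[True] * 5 for _ in range(5)]
bfs_steps = search(room, (0, 0), (4, 0), use_stack=False)
dfs_steps = search(room, (0, 0), (4, 0), use_stack=True)
```

Here the queue version returns the 4-step straight line down the left edge, while the stack version is free to wander around the room first and can never come back with anything shorter.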
@Whymarrh 2012-10-23 00:45:43
I've updated the question to include code, changed from DFS to BFS. I mustn't have written it properly.
@Pavan Yalamanchili 2012-10-23 22:55:00
@Whymarrh do you need the shortest path ? If not DFS will probably be much faster. I ask because you don't mention it in the original question.
@Whymarrh 2012-10-24 00:02:24
@Pavan not the shortest path in particular, any path will do.
@Mikhail 2012-10-24 08:50:21
@Pavan I don't agree DFS will be faster than BFS, since, as I've mentioned, their only difference is in the underlying data structure. You can terminate both when the finish is reached. Which one reaches the finish first depends on the maze.
@j_random_hacker 2012-10-24 11:51:06
BFS is the right choice here, because it produces a shortest path, which gives a "sensible" path even when the corridors are much wider than 1 pixel. OTOH DFS will tend to explore corridors and unpromising maze regions with a "flood fill" pattern.
@Pavan Yalamanchili 2012-10-24 15:57:50
@j_random_hacker That makes more sense now. I was not thinking of corridor width > 1pixel.
@Joseph Kern 2012-10-30 18:52:44
Can you post the unpathed version of the image? I am trying to duplicate your results, but my image has many isolated areas of the maze.
@Mikhail 2012-10-30 19:22:32
@JosephKern Path doesn't overlap any walls. Just remove all the red pixels and here you go.
@Joseph Kern 2012-11-01 09:40:23
This solution is written in Python. Thanks Mikhail for the pointers on the image preparation.
An animated Breadth-First Search:
The Completed Maze:
Note: Marks a white visited pixel grey. This removes the need for a visited list, but this requires a second load of the image file from disk before drawing a path (if you don't want a composite image of the final path and ALL paths taken).
A blank version of the maze I used.
@Joseph Kern 2012-11-03 10:30:30
Because you were awesome enough to come back and upvote me even after your question had been answered, I created an animated gif of the BFS, to help better visualize the process.
@math 2012-12-17 07:52:17
+1 This solution is very compact and uses free tools.
@stefano 2013-12-02 00:15:51
Nice one, thanks. For others who wish to play around with this, as I did, I'd like to share my tips based on difficulties I faced. 1) Either convert the image to pure black & white or modify your 'isWhite()' function to accept near-white|black. I wrote a 'cleanImage' method which preprocessed all pixels converting them to either pure white or black, otherwise the algorithm fails to find a path. 2) Read the image in explicitly as RGB [ base_img = Image.open(img_in); base_img = base_img.convert('RGB') ]. To get a gif, output several images and then run 'convert -delay 5 -loop 1 *.jpg bfs.gif'.
@sloewen 2017-01-01 11:10:08
missing indent in line 13
@stefano 2013-12-10 05:51:13
Here you go: maze-solver-python (GitHub)
I had fun playing around with this and extended on Joseph Kern's answer. Not to detract from it; I just made some minor additions for anyone else who may be interested in playing around with this.
It's a python-based solver which uses BFS to find the shortest path. My main additions, at the time, are:
As it stands, the start/end-points are hard-coded for this sample maze, but I plan on extending it such that you can pick the appropriate pixels.
@HolgT 2014-03-10 11:10:16
Great, thanks, it did not run on BSD/Darwin/Mac, some dependencies and the shell script needed minor changes, for those who want to try on Mac: [maze-solver-python]: github.com/holg/maze-solver-python
@stefano 2014-06-13 05:26:07
@HolgT: Glad you found it useful. I welcome any pull requests for this. :)
@lino 2012-10-29 16:23:01
Here are some ideas.
(1. Image Processing:)
1.1 Load the image as RGB pixel map. In C# it is trivial using system.drawing.bitmap. In languages with no simple support for imaging, just convert the image to portable pixmap format (PPM) (a Unix text representation, produces large files) or some simple binary file format you can easily read, such as BMP or TGA. ImageMagick in Unix or IrfanView in Windows can do the conversion.
1.2 You may, as mentioned earlier, simplify the data by taking the (R+G+B)/3 for each pixel as an indicator of gray tone and then threshold the value to produce a black and white table. Something close to 200 assuming 0=black and 255=white will take out the JPEG artifacts.
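The thresholding in step 1.2 can be sketched in a few lines. This is only an illustration: `to_open_grid` and the sample pixels are made up, and the image is assumed to be already loaded as rows of (R, G, B) tuples with values from 0 to 255.

```python
def to_open_grid(pixels, threshold=200):
    """Return rows of booleans: True where a pixel is light (open), False for walls.

    `pixels` is a list of rows, each row a list of (R, G, B) tuples in 0..255.
    """
    return [[(r + g + b) / 3 > threshold for (r, g, b) in row] for row in pixels]


# Tiny hand-made "image": pure white, a JPEG-smudged off-white,
# a dark grey and pure black.
sample = [[(255, 255, 255), (236, 240, 231)],
          [(60, 58, 66), (0, 0, 0)]]
grid = to_open_grid(sample)
```

The off-white pixel still counts as open and the dark grey one as wall, which is the point of thresholding on the average rather than testing for exact white.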
(2. Solutions:)
2.1 Depth-First Search: Init a stack with the starting location, collect available follow-up moves, pick one at random and push it onto the stack, and proceed until the end is reached or you hit a dead end. On a dead end, backtrack by popping the stack. You need to keep track of which positions were visited on the map, so that when you collect available moves you never take the same path twice. Very interesting to animate.
2.2 Breadth-First Search: Mentioned before, similar to the above but using a queue. Also interesting to animate. This works like flood-fill in image editing software. I think you may be able to solve a maze in Photoshop using this trick.
2.3 Wall Follower: Geometrically speaking, a maze is a folded/convoluted tube. If you keep your hand on the wall you will eventually find the exit ;) This does not always work. There are certain assumptions (perfect mazes, etc.); for instance, certain mazes contain islands. Do look it up; it is fascinating.
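As a rough illustration of the wall follower in 2.3, here is a right-hand-rule sketch on a grid of cells. The names `wall_follower` and `DIRS` and the toy maze are hypothetical, the grid is assumed to be rows of booleans (True = open), and giving up when the same cell and heading come round again is an addition of this sketch to cope with the island case.

```python
# Directions in clockwise order: up, right, down, left (row, col offsets).
DIRS = [(-1, 0), (0, 1), (1, 0), (0, -1)]


def wall_follower(grid, start, goal, heading=1):
    """Follow the right-hand wall from start; return the cells walked, or None."""
    rows, cols = len(grid), len(grid[0])

    def is_open(r, c):
        return 0 <= r < rows and 0 <= c < cols and grid[r][c]

    pos, walked, seen = start, [start], set()
    while pos != goal:
        if (pos, heading) in seen:
            return None             # same cell and heading again: going in circles
        seen.add((pos, heading))
        # Prefer a right turn, then straight on, then left, then turning back.
        for turn in (1, 0, -1, 2):
            d = (heading + turn) % 4
            nxt = (pos[0] + DIRS[d][0], pos[1] + DIRS[d][1])
            if is_open(*nxt):
                pos, heading = nxt, d
                walked.append(pos)
                break
        else:
            return None             # boxed in on all four sides
    return walked


maze = ["#####",
        "#S  #",
        "# # #",
        "#  E#",
        "#####"]
grid = [[ch != "#" for ch in row] for row in maze]
walked = wall_follower(grid, (1, 1), (3, 3))
```

The walk includes every detour into dead ends, so it is a route to the exit rather than a shortest path.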
(3. Comments:)
This is the tricky one. It is easy to solve mazes if they are represented in some simple array format, with each element being a cell type with north, east, south and west walls and a visited flag field. However, given that you are trying to do this from a hand-drawn sketch, it becomes messy. I honestly think that trying to rationalize the sketch will drive you nuts. This is akin to computer vision problems, which are fairly involved. Perhaps working directly on the image map may be easier, yet more wasteful.
@Jim Gray 2012-10-24 08:33:15
Tree search is too much. The maze is inherently separable along the solution path(s).
(Thanks to rainman002 from Reddit for pointing this out to me.)
Because of this, you can quickly use connected components to identify the connected sections of maze wall. This iterates over the pixels twice.
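A toy version of this idea in Python, under some loudly stated assumptions: the labelling below uses a simple flood fill rather than a two-pass algorithm, walls are grouped with 8-connectivity, corridors are one cell wide, and `label_walls`, `cells_between` and the maze are invented for the example.

```python
from collections import deque


def label_walls(grid):
    """Label 8-connected groups of wall cells (False) 1, 2, ...; open cells get 0."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] or labels[r][c]:
                continue                    # open cell, or wall already labelled
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr in range(cr - 1, cr + 2):
                    for nc in range(cc - 1, cc + 2):
                        if (0 <= nr < rows and 0 <= nc < cols
                                and not grid[nr][nc] and not labels[nr][nc]):
                            labels[nr][nc] = count
                            queue.append((nr, nc))
    return labels, count


def cells_between(grid, labels):
    """Open cells whose 8-neighbourhood touches both wall group 1 and wall group 2."""
    rows, cols = len(grid), len(grid[0])
    result = []
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            near = {labels[nr][nc]
                    for nr in range(max(r - 1, 0), min(r + 2, rows))
                    for nc in range(max(c - 1, 0), min(c + 2, cols))}
            if {1, 2} <= near:
                result.append((r, c))
    return result


# Entrance in the top row, exit in the bottom row, two dead ends inside.
maze = ["# #####",
        "#   # #",
        "### # #",
        "#     #",
        "##### #"]
grid = [[ch != "#" for ch in row] for row in maze]
labels, count = label_walls(grid)
route = cells_between(grid, labels)
```

In this maze the walls fall into exactly two groups, and the open cells squeezed between them are the solution route; the dead ends only ever touch one group.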
If you want to turn that into a nice diagram of the solution path(s), you can then use binary operations with structuring elements to fill in the "dead end" pathways for each connected region.
Demo code for MATLAB follows. It could use tweaking to clean up the result better, make it more generalizable, and make it run faster. (Sometime when it's not 2:30 AM.)
@kylefinn 2012-10-24 02:41:16
Uses a queue for a threshold continuous fill. Pushes the pixel left of the entrance onto the queue and then starts the loop. If a queued pixel is dark enough, it's colored light gray (above threshold), and all the neighbors are pushed onto the queue.
Solution is the corridor between gray wall and colored wall. Note this maze has multiple solutions. Also, this merely appears to work.
@zessx 2013-08-30 15:16:44
Interesting naive resolution, based on the hand-on-wall method. Indeed, not the best one, but I like it.
@Thomas 2012-10-21 07:17:42
I'd go for the matrix-of-bools option. If you find that standard Python lists are too inefficient for this, you could use a numpy.bool array instead. Storage for a 1000x1000 pixel maze is then just 1 MB.
Don't bother with creating any tree or graph data structures. That's just a way of thinking about it, but not necessarily a good way to represent it in memory; a boolean matrix is both easier to code and more efficient.
Then use the A* algorithm to solve it. For the distance heuristic, use the Manhattan distance (distance_x + distance_y).
Represent nodes by a tuple of (row, column) coordinates. Whenever the algorithm (Wikipedia pseudocode) calls for "neighbours", it's a simple matter of looping over the four possible neighbours (mind the edges of the image!).
If you find that it's still too slow, you could try downscaling the image before you load it. Be careful not to lose any narrow paths in the process.
Maybe it's possible to do a 1:2 downscaling in Python as well, checking that you don't actually lose any possible paths. An interesting option, but it needs a bit more thought.
@Boris Gorelik 2012-10-21 08:05:27
This excellent blog post shows how to solve a maze in Mathematica. Translating the method to Python shouldn't be a problem.
@Whymarrh 2012-10-23 00:59:29
I've updated the question. If I choose to use RGB triples in lieu of boolean values, would the storage still compare? The matrix is then 2400 * 1200. And would A* over BFS have a significant impact on real running time?
@Brian Cain 2012-11-02 19:14:42
@Whymarrh, the bit depth can shrink to compensate. 2 bits per pixel should be enough for anybody.