Wikipedia:Reference desk/Archives/Mathematics/2020 April 6
April 6
Matrix
I have some states, and some weighted transitions. These form a (relatively sparse) matrix in the obvious way.
I'd like to swap rows and columns until the matrix is as "diagonalised" as possible. (Not in the usual sense that the off-diagonals are zero.)
I'm not certain quite what I mean by this, but larger numbers should be on or near the diagonal, and zeros should be as far away as possible.
Is there a nice way of stating this problem, and is there a nice solution?
All the best: Rich Farmbrough (the apparently calm and reasonable) 13:34, 6 April 2020 (UTC).
- If it is a transition matrix, then items on the diagonal represent a transition of a state to itself. They correspond to loops in the directed graph of state transitions. If you swap rows and columns without constraint, you destroy this property. Are you sure you are prepared to do that? --Lambiam 14:07, 6 April 2020 (UTC)
- I thought I was... but now I am wondering if this is so nice to have that I want to keep it. In this case the leading diagonal will be zero. Maybe I am more interested in the diagonal above the leading diagonal. I think that does help with my thinking. Thank you. All the best: Rich Farmbrough (the apparently calm and reasonable) 14:55, 6 April 2020 (UTC).
- This sounds very similar to the Longest path problem; once you have the longest path then reindex to follow the path. There are (of course) many variations on the idea of this problem but according to the article the basic version is NP-hard; not good news if you're looking for a fast algorithm to solve it. But it looks like you've got additional conditions, for example since it's a transition matrix presumably it's stochastic, so it's hard to tell for sure that NP-hardness applies here. --RDBury (talk) 18:25, 6 April 2020 (UTC)
- It's relatively small, 18 states, and only 88 transitions in the data set, so it's probably computationally feasible. Thanks for the pointers. All the best: Rich Farmbrough (the apparently calm and reasonable) 08:43, 7 April 2020 (UTC).
- It may not seem much, but 18 states can be ordered in 6,402,373,705,728,000 ways, so the problem of finding the optimal arrangement may well still be computationally infeasible. --Lambiam 14:58, 7 April 2020 (UTC)
- This might be an instance of the XY problem. Why do you want to swap rows and columns around to make the matrix "more diagonal"? –Deacon Vorbis (carbon • videos) 18:28, 6 April 2020 (UTC)
- It would be nice to show the "natural" flow of the process down the diagonal. If you imagine ticks of time, then you could say the number of loops, rather than being zero, tends to infinity as the tick length tends to zero, so this is intuitively nicer than having a discontinuity on the leading diagonal. All the best: Rich Farmbrough (the apparently calm and reasonable) 08:43, 7 April 2020 (UTC).
- You want the assignment of indices to states to be such that for pairs of states (s_i, s_j) that have a high transition probability from s_i to s_j, the indices i and j are mostly close together (presumably preferably while i < j, so assuming a right stochastic matrix we should be aiming for upper co-diagonals). Is that correct? In that case the problem is probably best thought of as finding an optimal total ordering given certain preferences, vaguely reminiscent of topological sorting. --Lambiam 12:48, 7 April 2020 (UTC)
- I now see the similarity to the Longest path problem mentioned by RDBury above. There may be well-trodden path segments with high transition probabilities, and a heuristic could be to find and cluster such segments, resulting in a greedy algorithm similar to Kruskal's algorithm for constructing a minimum spanning tree. I further suspect some of the ideas in algorithms for genome sequencing – of which I know nothing – may be of interest. --Lambiam 13:29, 7 April 2020 (UTC)
- If you take the eigenvectors of your adjacency matrix, and rank them by the magnitude of their respective eigenvalues, the top few eigenvectors give you "topological coordinates" for the vertices. If you append the eigenvector to the matrix as an extra row, and sort the columns according to that row, you should get a more natural ordering. — I learned about this from An Atlas of Fullerenes, which apparently used this technique for its illustrations. I've also read that eigenvectors are used to rank college teams that never met (but are connected by a chain of pairs of teams that did meet); one of the eigenvectors has a unique property (all components positive, or some such) making it easily recognized as the one to use for this purpose. — When I experimented with topological coordinates for graphs, N years ago, I found (if memory serves) that the "weightiest" eigenvector is not independent: including it as one of three coordinate sets, for example, gives a flat figure. So try the next one. —Tamfang (talk) 05:08, 10 April 2020 (UTC)
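The greedy, Kruskal-like heuristic suggested earlier in the thread (cluster the well-trodden path segments first) could look roughly like the sketch below. It is only an illustration under assumptions that nobody in the thread stated: states are numbered 0..n-1, the transitions come as a dict mapping (from, to) pairs to weights, and the aim is to put heavy transitions directly above the leading diagonal. The names `greedy_order` and `weights` are made up for the example, and the result is not guaranteed to be optimal.

```python
def greedy_order(n, weights):
    """Heuristic ordering of states 0..n-1 (hypothetical helper).

    Takes transitions from heaviest to lightest and chains i -> j whenever
    i has no successor yet, j has no predecessor yet, and the link would
    not close a cycle. The resulting path fragments are then concatenated.
    """
    parent = list(range(n))  # union-find forest over path fragments

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    succ = [None] * n       # chosen successor of each state, if any
    has_pred = [False] * n  # whether a state already has a chosen predecessor

    for (i, j), _w in sorted(weights.items(), key=lambda kv: -kv[1]):
        if i == j or succ[i] is not None or has_pred[j]:
            continue
        ri, rj = find(i), find(j)
        if ri == rj:
            continue  # i and j are already on the same fragment: would form a cycle
        succ[i] = j
        has_pred[j] = True
        parent[ri] = rj

    # Walk each fragment from its head (a state without a predecessor).
    order = []
    for start in range(n):
        if not has_pred[start]:
            node = start
            while node is not None:
                order.append(node)
                node = succ[node]
    return order


# Made-up toy data: four states and five weighted transitions.
example = {(0, 2): 0.9, (2, 1): 0.8, (1, 3): 0.7, (0, 1): 0.1, (3, 0): 0.2}
print(greedy_order(4, example))  # prints [0, 2, 1, 3]
```

Reindexing rows and columns by the returned order puts the accepted transitions on the first upper co-diagonal. The fragments are simply concatenated in order of their lowest-numbered head, which is an arbitrary choice; a smarter joining rule might do better.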
This does sound like an XY problem. Can you describe the actual application? What are these states and weights, and are there loops in the graph? It sounds like maybe you want to find the most likely path through a Markov chain, which might involve some linear algebra. As for just permuting rows and columns, you can probably do ok with a few simple heuristics, and 18 elements isn't a lot, despite NP-hardness. NP-hardness only means the worst case instances can be hard to solve, but those instances can be very rare, and SAT solvers rely on that. They routinely solve problems with 1000s of variables. [1] is a good tutorial. 73.93.141.45 (talk) 09:22, 9 April 2020 (UTC)
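To give a feel for why 18 states need not be out of reach despite the 18! orderings: if the goal is taken to be "maximise the total weight on the first upper co-diagonal" (an assumption, since the thread never fixes an objective), that is a maximum-weight Hamiltonian path, and the standard Held-Karp style dynamic programme over subsets needs on the order of 2^18 × 18 × 18 ≈ 8.5 × 10^7 elementary steps rather than 18! of them. The sketch below uses a dense list-of-lists matrix `w` and a hypothetical function name `best_order`; it is a minimal illustration, not a tuned implementation, and for n = 18 in pure Python it would be slow and memory-hungry, though still finite.

```python
def best_order(w):
    """Exact maximum-weight ordering by subset dynamic programming.

    w is a square list-of-lists with w[i][j] the weight of transition i -> j.
    Returns (total, order), where order maximises the sum of
    w[order[k]][order[k + 1]] over all permutations of the states.
    """
    n = len(w)
    neg_inf = float("-inf")
    size = 1 << n
    # best[mask][last]: best weight of a path visiting exactly the states in
    # mask and ending at last; prev stores the predecessor for backtracking.
    best = [[neg_inf] * n for _ in range(size)]
    prev = [[-1] * n for _ in range(size)]
    for s in range(n):
        best[1 << s][s] = 0.0

    for mask in range(size):
        for last in range(n):
            cur = best[mask][last]
            if cur == neg_inf:
                continue
            for nxt in range(n):
                if mask & (1 << nxt):
                    continue  # nxt already visited
                new_mask = mask | (1 << nxt)
                cand = cur + w[last][nxt]
                if cand > best[new_mask][nxt]:
                    best[new_mask][nxt] = cand
                    prev[new_mask][nxt] = last

    full = size - 1
    last = max(range(n), key=lambda v: best[full][v])
    total = best[full][last]

    # Follow predecessor links back to the start of the path.
    order = []
    mask = full
    while last != -1:
        order.append(last)
        p = prev[mask][last]
        mask ^= 1 << last
        last = p
    order.reverse()
    return total, order


# Made-up toy matrix: four states, zeros on the leading diagonal.
toy = [
    [0.0, 0.1, 0.9, 0.0],
    [0.0, 0.0, 0.0, 0.7],
    [0.0, 0.8, 0.0, 0.0],
    [0.2, 0.0, 0.0, 0.0],
]
total, order = best_order(toy)
print(order)  # prints [0, 2, 1, 3]
```

A different objective, for instance one that also rewards weight on the second or third co-diagonal, would not fit this dynamic programme directly and would call for a heuristic or a general-purpose solver instead.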