Efficiently enumerating hitting sets of hypergraphs arising in data profiling. (English) Zbl 1430.68178
Kobourov, Stephen (ed.) et al., Proceedings of the 21st workshop on algorithm engineering and experiments, ALENEX ’19, San Diego, CA, USA, January 7–8, 2019. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM). 130-143 (2019).
Summary: We devise an enumeration method for inclusion-wise minimal hitting sets in hypergraphs. It has delay \(O(m^{k^\ast +1} \cdot n^2)\) and uses linear space. Hereby, \(n\) is the number of vertices, \(m\) the number of hyperedges, and \(k^\ast\) the rank of the transversal hypergraph. In particular, on classes of hypergraphs for which the cardinality \(k^\ast\) of the largest minimal hitting set is bounded, the delay is polynomial. The algorithm solves the extension problem for minimal hitting sets as a subroutine. We show that the extension problem is W[3]-complete when parameterised by the cardinality of the set which is to be extended. For the subroutine, we give an algorithm that is optimal under the exponential time hypothesis. Despite these lower bounds, we provide empirical evidence showing that the enumeration outperforms the theoretical worst-case guarantee on hypergraphs arising in the profiling of relational databases, namely, in the detection of unique column combinations.
For the entire collection see [Zbl 1409.68020].
For the entire collection see [Zbl 1409.68020].
MSC:
68R10 | Graph theory (including graph drawing) in computer science |
05C30 | Enumeration in graph theory |
05C65 | Hypergraphs |
68P15 | Database theory |
68Q17 | Computational difficulty of problems (lower bounds, completeness, difficulty of approximation, etc.) |
68W40 | Analysis of algorithms |