The smallest $c$ for which the bound holds is $c=\frac{1}{\sqrt 2-1}\approx 2.41$.
Lemmas 1 and 2 show that the bound holds for this $c$.
Lemma 3 shows that this bound is tight.
(In comparison, Juri's elegant probabilistic argument gives $c=4$.)
Let $c=\frac{1}{\sqrt 2-1}$.
Lemma 1 gives the upper bound for $k=0$.
Lemma 1:
If $f$ is $\epsilon_g$-near a function $g$ that has no influencing variables
in $S_2$, and $f$ is $\epsilon_h$-near a function $h$ that has no influencing variables
in $S_1$, then $f$ is $\epsilon$-near a constant function,
where $\epsilon\le c\,(\epsilon_g+\epsilon_h)/2$.
Proof.
Let $\epsilon$ be the distance from $f$ to the nearest constant function.
Suppose for contradiction that $\epsilon$ does not satisfy the claimed inequality.
Let $y=(x_1,x_2,\ldots,x_{n/2})$ and $z=(x_{n/2+1},\ldots,x_n)$,
and write $f$, $g$, and $h$ as $f(y,z)$, $g(y,z)$, and $h(y,z)$,
so $g(y,z)$ is independent of $z$ and $h(y,z)$ is independent of $y$.
(I find it helpful to visualize $f$ as the edge-labeling
of the complete bipartite graph with vertex sets $\{y\}$ and $\{z\}$,
where $g$ gives a vertex-labeling of $\{y\}$,
and $h$ gives a vertex-labeling of $\{z\}$.)
Let $g_0$ be the fraction of pairs $(y,z)$ such that $g(y,z)=0$.
Let $g_1=1-g_0$ be the fraction of pairs such that $g(y,z)=1$.
Likewise let $h_0$ be the fraction of pairs such that $h(y,z)=0$,
and let $h_1$ be the fraction of pairs such that $h(y,z)=1$.
Without loss of generality, assume that, for any pair such that $g(y,z)=h(y,z)$,
it also holds that $f(y,z)=g(y,z)=h(y,z)$. (Otherwise, toggling the value of
$f(y,z)$ allows us to decrease both $\epsilon_g$ and $\epsilon_h$ by $1/2^n$,
while decreasing $\epsilon$ by at most $1/2^n$,
so the resulting function is still a counter-example.) Say any such pair is "in agreement".
The distance from $f$ to $g$ plus the distance from $f$ to $h$
is the fraction of $(y,z)$ pairs that are not in agreement.
That is, $\epsilon_g+\epsilon_h=g_0h_1+g_1h_0$.
The distance from $f$ to the all-zero function is at most $1-g_0h_0$.
The distance from $f$ to the all-ones function is at most $1-g_1h_1$.
Further, the distance from $f$ to the nearest constant function is at most $1/2$.
Thus, the ratio $\epsilon/(\epsilon_g+\epsilon_h)$ is at most
$$\frac{\min(1/2,\;1-g_0h_0,\;1-g_1h_1)}{g_0h_1+g_1h_0},$$
where
$g_0,h_0\in[0,1]$ and
$g_1=1-g_0$ and
$h_1=1-h_0$.
By calculation, this ratio is at most
$\frac{1}{2(\sqrt 2-1)}=c/2$. QED
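
The "by calculation" step at the end of the proof of Lemma 1 can be sanity-checked numerically. The following is only a rough sketch, not a proof: it evaluates the ratio bound on a uniform grid of $(g_0,h_0)$ values and reports the largest value found. The function names `ratio_bound` and `grid_max` and the grid resolution are illustrative choices, not part of the argument.

```python
import math

C = 1 / (math.sqrt(2) - 1)  # the constant c from the text


def ratio_bound(g0, h0):
    """Upper bound on eps / (eps_g + eps_h) for given g0, h0 in [0, 1].

    Returns None when the denominator g0*h1 + g1*h0 is zero
    (g and h are then the same constant, so there is nothing to bound).
    """
    g1, h1 = 1 - g0, 1 - h0
    denom = g0 * h1 + g1 * h0
    if denom == 0:
        return None
    return min(0.5, 1 - g0 * h0, 1 - g1 * h1) / denom


def grid_max(steps=400):
    """Largest value of ratio_bound over a (steps+1) x (steps+1) grid."""
    best = 0.0
    for i in range(steps + 1):
        for j in range(steps + 1):
            r = ratio_bound(i / steps, j / steps)
            if r is not None and r > best:
                best = r
    return best


if __name__ == "__main__":
    print("grid maximum:", grid_max())
    print("c / 2       :", C / 2)
```

On any finite grid the maximum should come out slightly below $c/2$, because the maximizing points $g_0=h_0=1/\sqrt 2$ and $g_0=h_0=1-1/\sqrt 2$ are irrational and therefore never lie exactly on the grid.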
Lemma 2 extends Lemma 1 to general $k$ by arguing pointwise,
over every possible setting of the $2k$ influencing variables.
Recall that $c=\frac{1}{\sqrt 2-1}$.
Lemma 2: Fix any $k$.
If $f$ is $\epsilon_g$-near a function $g$ that has
$k$ influencing variables in $S_2$, and $f$ is $\epsilon_h$-near a function $h$ that
has $k$ influencing variables in $S_1$,
then $f$ is $\epsilon$-near a function $\hat f$
that has at most $2k$ influencing variables,
where $\epsilon\le c\,(\epsilon_g+\epsilon_h)/2$.
Proof. Express $f$ as $f(a,y,b,z)$ where $(a,y)$ contains the variables in $S_1$
with $a$ containing those that influence $h$, while $(b,z)$ contains the
variables in $S_2$ with $b$ containing those influencing $g$.
So $g(a,y,b,z)$ is independent of $z$,
and $h(a,y,b,z)$ is independent of $y$.
For each fixed value of $a$ and $b$, define $F_{ab}(y,z)=f(a,y,b,z)$,
and define $G_{ab}$ and $H_{ab}$ similarly from $g$ and $h$ respectively.
Let $\epsilon^g_{ab}$ be the distance from $F_{ab}$ to $G_{ab}$
(restricted to $(y,z)$ pairs).
Likewise let $\epsilon^h_{ab}$ be the distance from $F_{ab}$ to $H_{ab}$.
By Lemma 1, there exists a constant $c_{ab}$ such that
the distance (call it $\epsilon_{ab}$)
from $F_{ab}$ to the constant function $c_{ab}$
is at most $c\,(\epsilon^h_{ab}+\epsilon^g_{ab})/2$.
Define $\hat f(a,y,b,z)=c_{ab}$.
Clearly $\hat f$ depends only on
$a$ and $b$ (and thus on at most $2k$ variables).
Let $\epsilon_{\hat f}$ be the average, over the $(a,b)$ pairs,
of the $\epsilon_{ab}$'s, so that
the distance from $f$ to $\hat f$ is $\epsilon_{\hat f}$.
Likewise, the distances from $f$ to $g$ and from $f$ to $h$
(that is, $\epsilon_g$ and $\epsilon_h$) are the averages,
over the $(a,b)$ pairs, of, respectively, $\epsilon^g_{ab}$ and $\epsilon^h_{ab}$.
Since $\epsilon_{ab}\le c\,(\epsilon^h_{ab}+\epsilon^g_{ab})/2$
for all $a,b$, it follows that
$\epsilon_{\hat f}\le c\,(\epsilon_g+\epsilon_h)/2$. QED
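
The averaging argument in the proof of Lemma 2 can be illustrated on a small random instance. The sketch below is an assumption-laden toy, not part of the proof: the sizes `k` and `m`, the random choice of $g$ and $h$, and the way $f$ is built (copying $g$ or $h$ at random at each point) are all hypothetical. It uses the factor $c/2$ obtained at the end of the proof of Lemma 1, applies that bound to each block $(a,b)$, and then averages.

```python
import itertools
import math
import random

C = 1 / (math.sqrt(2) - 1)  # the constant c from the text


def lemma2_check(k=1, m=3, seed=0):
    """Return (eps_hat, eps_g, eps_h) for one random instance.

    a and b have k bits each; y and z have m bits each (illustrative sizes).
    """
    rng = random.Random(seed)
    AB = list(itertools.product((0, 1), repeat=k))  # settings of a (and of b)
    YZ = list(itertools.product((0, 1), repeat=m))  # settings of y (and of z)
    block = len(YZ) ** 2                            # number of (y, z) pairs

    # g ignores z and h ignores y, as in the proof.
    g = {(a, y, b): rng.randint(0, 1) for a in AB for y in YZ for b in AB}
    h = {(a, b, z): rng.randint(0, 1) for a in AB for b in AB for z in YZ}

    eps_hat = eps_g = eps_h = 0.0
    for a in AB:
        for b in AB:
            ones = diff_g = diff_h = 0
            for y in YZ:
                for z in YZ:
                    # Hypothetical f: copy g or h at random at each point.
                    v = g[a, y, b] if rng.random() < 0.5 else h[a, b, z]
                    ones += v
                    diff_g += v != g[a, y, b]
                    diff_h += v != h[a, b, z]
            # Distance from F_ab to its best constant c_ab (the majority value).
            e_ab = min(ones, block - ones) / block
            # Lemma 1 applied to the block (a, b), with the factor c/2.
            assert e_ab <= C / 2 * (diff_g + diff_h) / block + 1e-12
            eps_hat += e_ab
            eps_g += diff_g / block
            eps_h += diff_h / block
    pairs = len(AB) ** 2  # number of (a, b) settings to average over
    return eps_hat / pairs, eps_g / pairs, eps_h / pairs


if __name__ == "__main__":
    e_hat, e_g, e_h = lemma2_check()
    print(e_hat, "<=", C / 2 * (e_g + e_h))
```

Because every block satisfies the Lemma 1 inequality and all three distances are plain averages over the $(a,b)$ settings, the averaged inequality follows by linearity, which is the whole content of the proof.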
Lemma 3 shows that the constant $c$ above is the best you can hope
for (even for $k=0$ and $\epsilon=0.5$).
Lemma 3: There exists $f$ such that $f$ is $(0.5/c)$-near two functions $g$ and $h$,
where $g$ has no influencing variables in $S_2$
and $h$ has no influencing variables in $S_1$,
and $f$ is $0.5$-far from every constant function.
Proof.
Let $y$ and $z$ be $x$ restricted to, respectively, $S_1$ and $S_2$.
That is, $y=(x_1,\ldots,x_{n/2})$ and $z=(x_{n/2+1},\ldots,x_n)$.
Identify each possible $y$ with a unique element of $[N]$,
where $N=2^{n/2}$.
Likewise, identify each possible $z$ with a unique element of $[N]$.
Thus, we think of $f$ as a function from $[N]\times[N]$ to $\{0,1\}$.
Define $f(y,z)$ to be $1$ iff $\max(y,z)\ge\frac{1}{\sqrt 2}N$.
By calculation, the fraction of $f$'s values that are zero
is $\left(\frac{1}{\sqrt 2}\right)^2=\frac12$,
so both constant functions have distance $\frac12$ to $f$.
Define $g(y,z)$ to be $1$ iff $y\ge\frac{1}{\sqrt 2}N$.
Then $g$ has no influencing variables in $S_2$.
The distance from $f$ to $g$ is the fraction of pairs $(y,z)$
such that $y<\frac{1}{\sqrt 2}N$ and $z\ge\frac{1}{\sqrt 2}N$.
By calculation, this is at most $\frac{1}{\sqrt 2}\left(1-\frac{1}{\sqrt 2}\right)=0.5/c$.
Similarly, the distance from $f$ to $h$, where $h(y,z)=1$
iff $z\ge\frac{1}{\sqrt 2}N$, is at most $0.5/c$.
QED
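
The construction in the proof of Lemma 3 can be checked by brute force for a small $n$. This is a minimal sketch under a few assumptions: $[N]$ is identified with $\{0,\ldots,N-1\}$, `n = 16` is an arbitrary illustrative size, and the function name `lemma3_distances` is hypothetical. Since $N/\sqrt 2$ is never an integer, the computed distances only approximate $1/2$ and $0.5/c$ for finite $N$, with the rounding error shrinking as $N$ grows.

```python
import math

C = 1 / (math.sqrt(2) - 1)  # the constant c from the text


def lemma3_distances(n=16):
    """Brute-force the distances for the threshold construction.

    y and z range over {0, ..., N-1} with N = 2^(n/2) (an assumed encoding).
    """
    N = 2 ** (n // 2)
    t = N / math.sqrt(2)  # the threshold N / sqrt(2)
    pairs = N * N
    ones = diff_g = diff_h = 0
    for y in range(N):
        for z in range(N):
            f = max(y, z) >= t  # f(y, z) = 1 iff max(y, z) >= N / sqrt(2)
            g = y >= t          # g depends only on y
            h = z >= t          # h depends only on z
            ones += f
            diff_g += f != g
            diff_h += f != h
    return {
        "to_zero": ones / pairs,      # distance to the all-zero function
        "to_one": 1 - ones / pairs,   # distance to the all-ones function
        "to_g": diff_g / pairs,
        "to_h": diff_h / pairs,
    }


if __name__ == "__main__":
    d = lemma3_distances()
    print(d)
    print("0.5 / c =", 0.5 / C)
```

Comparing `to_g` and `to_h` against `0.5 / C`, and `to_zero` and `to_one` against `0.5`, shows how closely the finite instance matches the values claimed in the proof.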