Ergodic-inspired methods
[Cleanup needed.]
Idea #1: extreme localisation
Let [math]\displaystyle{ A \subset [3]^n }[/math] be line-free with density [math]\displaystyle{ \delta }[/math]. Let [math]\displaystyle{ m = m(\delta) }[/math] be a medium size integer independent of n. We embed [math]\displaystyle{ [3]^m }[/math] inside [math]\displaystyle{ [3]^n }[/math] to create a random set [math]\displaystyle{ A_m \subset [3]^m }[/math] which enjoys stationarity properties. We then look at the events [math]\displaystyle{ E_{i,j} }[/math] for [math]\displaystyle{ 1 \leq i \leq j \leq m }[/math], which are the event that [math]\displaystyle{ 1^i 0^{j-i} 2^{m-j} }[/math] lies in [math]\displaystyle{ A_m }[/math]. As A is line-free, we observe that [math]\displaystyle{ E_{i,i}, E_{i,j}, E_{j,j} }[/math] cannot simultaneously occur for any [math]\displaystyle{ 1 \leq i \lt j \leq m }[/math]. Also, each of the [math]\displaystyle{ E_{i,j} }[/math] have probability about [math]\displaystyle{ \delta }[/math].
On the other hand, by the first moment method, many of the [math]\displaystyle{ E_{i,i} }[/math] hold with positive probability. Some Cauchy-Schwarz then tells us that there exists [math]\displaystyle{ 1 \leq i \lt i' \lt j \lt j' \leq n }[/math] such that [math]\displaystyle{ E_{i,j} \wedge E_{i',j} \wedge E_{i,j'} \wedge E_{i',j'} }[/math] has probability significantly larger than [math]\displaystyle{ \delta^4 }[/math].
One can view the events [math]\displaystyle{ E_{i,j} }[/math] as an i+m-j-uniform hypergraph, by fixing a base point x and viewing the random subspace [math]\displaystyle{ [3]^m }[/math] as formed by modifying x on m random indices. The above correlation would mean some significant irregularity in this hypergraph; the hope is that this implies some sort of usable structure on A that can be used, for instance to locate a density increment.
Idea #2: IP Roth first
McCutcheon.508: I will give my general idea for a proof. I’m pretty sure it’s sound, though it may not be feasible in practice. On the other hand I may be badly mistaken about something. I will throw it out there for someone else to attempt, or say why it’s nonsense, or perhaps ignore. I won’t formulate it as a strategy to prove DHJ, but of what I’ve called IP Roth. If successful, one could possibly adapt it to the DHJ, k=3 situation, but there would be complications that would obscure what was going on.
We work in [math]\displaystyle{ X=[n]^{[n]}\times [n]^{[n]}. }[/math] For a real valued function [math]\displaystyle{ f }[/math] defined on [math]\displaystyle{ X }[/math], define the first coordinate 2-norm by [math]\displaystyle{ ||f||^1_2=(\iplim_b\iplim_a {1\over |X|}\sum_{(x,y)\in X} f((x,y))f((x+a,y))f((x+b,y))f((x+a+b,y)))^{1/4}. }[/math] The second coordinate 2-norm is defined similarly (on the second coordinate, obviously). Now, let me explain what this means. [math]\displaystyle{ a }[/math] and [math]\displaystyle{ b }[/math] are subsets of [math]\displaystyle{ [n] }[/math], and we identify [math]\displaystyle{ a }[/math] with the characteristic function of [math]\displaystyle{ a }[/math], which is a member of [math]\displaystyle{ [n]^{[n]}. }[/math] (That is how we can add [math]\displaystyle{ a }[/math] to [math]\displaystyle{ x }[/math] inside, etc. Since [math]\displaystyle{ [n] }[/math] is a finite set, you can’t really take limits, but if [math]\displaystyle{ n }[/math] is large, we can do something almost as good, namely ensure that whenever [math]\displaystyle{ \max\alpha\lt \min\beta }[/math], the expression we are taking the limit of is close to something (Milliken Taylor ensures this, I think). Of course, you have to restrict [math]\displaystyle{ a }[/math] and [math]\displaystyle{ b }[/math] to a subspace. What is a subspace? You take a sequence [math]\displaystyle{ a_i }[/math] of subsets of [math]\displaystyle{ [n] }[/math] with [math]\displaystyle{ \max a_i\lt \min a_{i+1} }[/math] and then restrict to unions of the [math]\displaystyle{ a_i. }[/math]
Now here is the idea. Take a subset [math]\displaystyle{ E }[/math] of [math]\displaystyle{ X }[/math] and let [math]\displaystyle{ f }[/math] be its balanced indicator function. You first want to show that if either of the coordinate 2-norms of [math]\displaystyle{ f }[/math] is small, then [math]\displaystyle{ E }[/math] contains about the right number of corners [math]\displaystyle{ \{ (x,y), (x+a,y), (x,y+a)\}. }[/math] Restricted to the subspace of course. What does that mean? Well, you treat each of the [math]\displaystyle{ a_i }[/math] as a single coordinate, moving them together. The other coordinates I’m not sure about. Maybe you can just fix them in the right way and have the norm that was small summing over all of [math]\displaystyle{ X }[/math] still come out small. At any rate, the real trick is to show that if both coordinate 2-norms are big, you get a density increment on a subspace. Here a subspace surely means that you find some [math]\displaystyle{ a_i }[/math]s, treat them as single coordinates, and fix the values on the other coordinates.