App-phoebe

 view release on metacpan or  search on metacpan

Changes  view on Meta::CPAN

33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
- App::Phoebe::Spartan is new; an alternative to Gemini and Titan
  since it's very simply, suggests gemtext for text formatting, allows
  uploads, but doesn't do TLS
 
- App::Phoebe::Iapetus is new; an alternative to the Titan protocol
  for uploading
 
- App::Phoebe::SpeedBump loads the block list on startup; as it saves
  every half an hour, that means you no longer have to worry too much
  about losing information on the blocked IP numbers and networks
 
- App::Gopher is new and improved; it no longer just prints the Gemini
  text but does line wrapping and all that
 
3.00
 
- Add special feeds for the blog, i.e. the pages starting with an ISO
  date
 
2.08

README.md  view on Meta::CPAN

1592
1593
1594
1595
1596
1597
1598
1599
1600
1601
1602
1603
1604
1605
1606
1607
1608
1609
1610
1611
1612
1613
1614
1615
1616
1617
1618
We want to block crawlers that are too fast or that don’t follow the
instructions in robots.txt. We do this by keeping a list of recent visitors: for
every IP number, we remember the timestamps of their last visits. If they make
more than 30 requests in 60s, we block them for an ever increasing amount of
seconds, starting with 60s and doubling every time this happens.
 
For every IP number, Phoebe also records whether the last 30 requests were
“suspicious” or not. A suspicious request is a request that is “disallowed” for
bots according to “robots.txt” (more or less). If 10 requests or more of the
last 30 requests in the last 60 seconds are suspicious, the IP number is
blocked.
 
When an IP number is blocked, it is blocked for 60s, and there’s a 120s
probation time. When you’re blocked, Phoebe responds with a “44” response. This
means: slow down!
 
If the IP number is unblocked but gives cause for another block in the probation
time, it is blocked again and the blocking time is doubled: the IP is blocked
for 120s and there’s 240s probation time. And if it happens again, it is doubled
again.
 
There is no configuration required, but adding a known fingerprint is suggested.
The `/do/speed-bump` URL shows you more information, if you have a client
certificate with a known fingerprint.
 
The exact number of requests and the length of the time window (in seconds) can
be changed in the `config` file, too.

lib/App/Phoebe.pm  view on Meta::CPAN

328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
# We can't use C<flock> because this defaults to C<fcntl> which means they are
# I<per process>
sub with_lock {
  my $stream = shift;
  my $host = shift;
  my $space = shift;
  my $code = shift;
  my $count = shift || 0;
  my $dir = wiki_dir($host, $space);
  my $lock = "$dir/locked";
  # remove stale locks
  if (-e $lock) {
    my $age = time() - modified($lock);
    $log->debug("lock is ${age}s old");
    rmdir $lock if -e $lock and $age > 5;
  }
  if (mkdir($lock)) {
    $log->debug("Running code with lock $lock");
    eval { $code->() }; # protect against exceptions
    if ($@) {
      $log->error("Unable to run code with locked $lock: $@");
      result($stream, "40", "An error occured, unfortunately");
    }
    rmdir($lock);
    $stream->close_gracefully();
  } elsif ($count > 25) {
    $log->error("Unable to unlock $lock");
    result($stream, "40", "The wiki is locked; try again in a few seconds");
    $stream->close_gracefully();
  } else {
    $log->debug("Waiting $count...");
    Mojo::IOLoop->timer(0.2 => sub {
      with_lock($stream, $host, $space, $code, $count + 1)});
    # don't close the stream
  }
}
 
sub write_page {

lib/App/Phoebe/SpeedBump.pm  view on Meta::CPAN

25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
We want to block crawlers that are too fast or that don’t follow the
instructions in robots.txt. We do this by keeping a list of recent visitors: for
every IP number, we remember the timestamps of their last visits. If they make
more than 30 requests in 60s, we block them for an ever increasing amount of
seconds, starting with 60s and doubling every time this happens.
 
For every IP number, Phoebe also records whether the last 30 requests were
“suspicious” or not. A suspicious request is a request that is “disallowed” for
bots according to “robots.txt” (more or less). If 10 requests or more of the
last 30 requests in the last 60 seconds are suspicious, the IP number is
blocked.
 
When an IP number is blocked, it is blocked for 60s, and there’s a 120s
probation time. When you’re blocked, Phoebe responds with a “44” response. This
means: slow down!
 
If the IP number is unblocked but gives cause for another block in the probation
time, it is blocked again and the blocking time is doubled: the IP is blocked
for 120s and there’s 240s probation time. And if it happens again, it is doubled
again.
 
There is no configuration required, but adding a known fingerprint is suggested.
The C</do/speed-bump> URL shows you more information, if you have a client
certificate with a known fingerprint.
 
The exact number of requests and the length of the time window (in seconds) can
be changed in the F<config> file, too.

lib/App/Phoebe/SpeedBump.pm  view on Meta::CPAN

112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
             or $speed_data->{$ip}->{probation} < $now)
        and (not $speed_data->{$ip}->{visits}
             or @{$speed_data->{$ip}->{visits}} == 0
             or $speed_data->{$ip}->{visits}->[0] < $now - $speed_bump_window)) {
      delete($speed_data->{$ip});
    }
  }
  for my $cidr (keys %$speed_cidr_data) {
    delete($speed_cidr_data->{$cidr}) if $speed_cidr_data->{$cidr} < $now;
  }
  # check whether the range is blocked
  my $ip = $stream->handle->peerhost;
  my $ob = new Net::IP($ip);
  for my $cidr (keys %$speed_cidr_data) {
    my $range = new Net::IP($cidr) or $log->error(Net::IP::Error());
    my $overlap = $range->overlaps($ob);
    # $IP_PARTIAL_OVERLAP (ranges overlap) $IP_NO_OVERLAP (no overlap)
    # $IP_A_IN_B_OVERLAP (range2 contains range1) $IP_B_IN_A_OVERLAP (range1
    # contains range2) $IP_IDENTICAL (ranges are identical) undef (problem)
    if (defined $overlap and $overlap != $IP_NO_OVERLAP) {
      $log->info("Net range $cidr is blocked");
      my $delta = $speed_cidr_data->{$cidr} - $now;
      result($stream, "44", "$delta");
      # no more processing
      return 1;
    }
  }
  # check if the ip is currently blocked and extend the block if so
  if (exists $speed_data->{$ip}) {
    my $until = $speed_data->{$ip}->{until};
    if ($until and $until > $now) {
      my $seconds = speed_bump_add($ip, $now);
      $log->info("IP is blocked, extending by $seconds");
      my $delta = $speed_data->{$ip}->{until} - $now;
      result($stream, "44", "$delta");
      # no more processing
      return 1;
    }
  }
  # add a timestamp to the front for the current $ip
  unshift(@{$speed_data->{$ip}->{visits}}, $now);
  # add a warning to the front for the current $ip if the current URL could be a bot
  unshift(@{$speed_data->{$ip}->{warnings}},
          scalar $url =~ m!/(raw|html|diff|history|do/(?:comment|do/(?:all/(?:latest/)?)?changes/|rss|(?:all)?atom|new|more|match|search|index|tag))/!);
  # if there are enough timestamps, pop the last one and see if it falls within
  # the time window; if so, all the requests happened within the time window
  # we're watching
  if (@{$speed_data->{$ip}->{visits}} > $speed_bump_requests) {
    pop(@{$speed_data->{$ip}->{warnings}});
    my $oldest = pop(@{$speed_data->{$ip}->{visits}});
    if ($now < $oldest + $speed_bump_window) {
      my $seconds = speed_bump_add($ip, $now);
      $log->info("Blocked for $seconds because of too many requests");
      result($stream, "44", "$seconds");
      # no more processing
      return 1;
    }
  }
  # even if the browsing speed is ok, we want to block you if you're visiting a
  # lot of URLs that a human would not
  my $warnings = sum(@{$speed_data->{$ip}->{warnings}}) || 0;
  if ($warnings > $speed_bump_requests / 3) {
    my $seconds = speed_bump_add($ip, $now);
    $log->info("Blocked for $seconds because of too many suspicious requests");
    result($stream, "44", "$seconds");
    # no more processing
    return 1;
  }
  # maintenance is done and no block was required, carry on
  return 0;
}
 
sub speed_bump_add {
  my $ip = shift;

lib/App/Phoebe/SpeedBump.pm  view on Meta::CPAN

284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
      $stream->write("The speed bump data has been reset.\n");
      $stream->write("=> /do/speed-bump menu\n") });
    return 1;
  }
  return;
}
 
sub speed_bump_compute_cidr_blocks {
  my %count;
  my %until;
  # check which CIDR has been blocked at least three times
  for my $ip (keys %$speed_data) {
    my $cidr = $speed_data->{$ip}->{cidr};
    next unless $cidr;
    $count{$cidr}++;
    $until{$cidr} ||= $speed_data->{$ip}->{until};
    $until{$cidr} = $speed_data->{$ip}->{until}
      if $speed_data->{$ip}->{until} and $speed_data->{$ip}->{until} > $until{$cidr};
  }
  # only copy the blocked-until timestamp for those CIDRs that were listed at least three times
  for my $cidr (keys %count) {
    next unless $count{$cidr} >= 3;
    speed_bump_add_cidr($cidr, $until{$cidr});
  }
}
 
sub with_speed_bump_fingerprint {
  my $stream = shift;
  my $fun = shift;
  my $fingerprint = $stream->handle->get_fingerprint();

lib/App/Phoebe/SpeedBump.pm  view on Meta::CPAN

352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
  $stream->write("```\n");
  $stream->write("=> /do/speed-bump menu\n");
}
 
sub speed_bump_cidr {
  my $ip = shift;
  my $now = shift;
  my $cidr = $speed_data->{$ip}->{cidr};
  my $until = $speed_data->{$ip}->{until};
  return $cidr if $cidr or not $until or $until - $now < 604800;
  # if blocked for at least 7d and no cidr is available, get it: 7*24*60*60 = 604800
  $ip = new Net::IP ($ip) or return;
  my $reverse = $ip->reverse_ip();
  $reverse =~ s/in-addr\.arpa\.$/asn.routeviews.org/;
  $log->debug("DNS TXT query for $reverse");
  for my $rr (rr($reverse, "TXT")) {
    next unless $rr->type eq "TXT";
    my @data = $rr->txtdata;
    $log->debug("DNS TXT @data");
    $cidr = join("/", @data[1..2]);
    $speed_data->{$ip}->{cidr} = $cidr;

t/BlockFediverse.t  view on Meta::CPAN

23
24
25
26
27
28
29
30
31
32
33
34
35
require './t/test.pl';
 
# variables set by test.pl
our $host;
our $port;
 
like(query_web("GET / HTTP/1.0\r\nhost: $host:$port"),
     qr/^HTTP\/1.1 200 OK/, "Web is served");
 
like(query_web("GET / HTTP/1.0\r\nhost: $host:$port\r\nuser-agent: Mastodon"),
     qr/^HTTP\/1.1 400 Bad Request/, "Mastodon is blocked");
 
done_testing;

t/SpeedBump.t  view on Meta::CPAN

102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
like($page, qr(18\.135\.104\.61), "IP number found");
like($page, qr(CIDR\n.*18\.132\.0\.0/14), "CIDR number found");
 
$page = query_gemini("$base/");
like($page, qr(^20), "Request 1");
 
$page = query_gemini("$base/");
like($page, qr(^20), "Request 2");
 
$page = query_gemini("$base/");
like($page, qr(^44 60), "Request 3 is blocked for 60s");
 
$page = query_gemini("$base/do/speed-bump/status");
#  From    To Warns Block Until Probation IP
#   -5s   -5s  2/ 2   60s   55s      115s 127.0.0.1
like($page, qr(^ +0s +0s +2\/ 2\ +60s +60s +120s +127\.0\.0\.1)m, "Blocked for 60s!");
 
# also making sure all the data from the old JSON file expired
unlike($page, qr(18\.135\.104\.61), "IP number no longer found");
unlike($page, qr(CIDR\n.*18\.132\.0\.0/14), "CIDR number no longer found");
 
done_testing();

t/locks.t  view on Meta::CPAN

30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
Smiling faces float
Tonight in the city park
Phone screens shining bright
EOT
 
my $page = query_gemini("$titan/raw/Haiku;size=74;mime=text/plain;token=hello", $haiku);
like($page, qr/^30 $base\/page\/Haiku\r$/, "Titan Haiku");
 
ok(read_text("$dir/page/Haiku.gmi") eq $haiku, "Haiku saved");
 
mkdir("$dir/locked");
 
# 1s timer
Mojo::IOLoop->timer(1 => sub {
  pass("Waiting 1s");
  ok(read_text("$dir/page/Haiku.gmi") eq $haiku, "Haiku unchanged");
  rmdir("$dir/locked")});
 
my $haiku2 = <<EOT;
Pink peaks and blue rocks
The sun is gone and I'm cold
The Blackbird still sings
EOT
 
# while it waits for the lock to expire, 1s passes and the lock is removed
query_gemini("$titan/raw/Haiku;size=81;mime=text/plain;token=hello", $haiku2);

t/oddmuse-wiki.pl  view on Meta::CPAN

191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
           clear => \&DoClearCache,        debug => \&DoDebug,
           contrib => \&DoContributors,    more => \&DoJournal);
our @MyRules = (\&LinkRules, \&ListRule); # don't set this variable, add to it!
our %RuleOrder = (\&LinkRules => 0, \&ListRule => 0);
 
# The 'main' program, called at the end of this script file (aka. as handler)
sub DoWikiRequest {
  Init();
  DoSurgeProtection();
  if (not $BannedCanRead and UserIsBanned() and not UserIsEditor()) {
    ReportError(T('Reading not allowed: user, ip, or network is blocked.'), '403 FORBIDDEN',
                0, $q->p(ScriptLink('action=password', T('Login'), 'password')));
  }
  DoBrowseRequest();
}
 
sub ReportError {   # fatal!
  my ($errmsg, $status, $log, @html) = @_;
  InitRequest(); # make sure we can report errors before InitRequest
  print GetHttpHeader('text/html', 'nocache', $status), GetHtmlHeader(T('Error')),
    $q->start_div({class=>'error'}), $q->h1(QuoteHtml($errmsg)), @html, $q->end_div,

t/oddmuse-wiki.pl  view on Meta::CPAN

252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
    my $page = ParseData($data); # before InitVariables so GetPageContent won't work
    eval $page->{text} if $page->{text}; # perlcritic dislikes the use of eval here but we really mean it
    $Message .= CGI::p("$ConfigPage: $@") if $@;
  }
}
 
sub InitDirConfig {
  $PageDir     = "$DataDir/page"# Stores page data
  $KeepDir     = "$DataDir/keep"# Stores kept (old) page data
  $TempDir     = "$DataDir/temp"# Temporary files and locks
  $LockDir     = "$TempDir/lock"# DB is locked if this exists
  $NoEditFile  = "$DataDir/noedit"; # Indicates that the site is read-only
  $RcFile      = "$DataDir/rc.log"; # New RecentChanges logfile
  $RcOldFile   = "$DataDir/oldrc.log"; # Old RecentChanges logfile
  $IndexFile   = "$DataDir/pageidx";   # List of all pages
  $VisitorFile = "$DataDir/visitors.log"; # List of recent visitors
  $DeleteFile  = "$DataDir/delete.log"; # Deletion logfile
  $RssDir      = "$DataDir/rss";    # For rss feed cache
  $ConfigFile ||= "$DataDir/config"# Config file with Perl code to execute
  $ModuleDir  ||= "$DataDir/modules"; # For extensions (ending in .pm or .pl)
}

t/oddmuse-wiki.pl  view on Meta::CPAN

727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
  return;
}
 
sub RunMyMacros {
  $_ = shift;
  foreach my $macro (@MyMacros) { $macro->() };
  return $_;
}
 
sub PrintWikiToHTML {
  my ($markup, $is_saving_cache, $revision, $is_locked) = @_;
  my ($blocks, $flags);
  $FootnoteNumber = 0;
  $markup =~ s/$FS//g if $markup# Remove separators (paranoia)
  $markup = QuoteHtml($markup);
  ($blocks, $flags) = ApplyRules($markup, 1, $is_saving_cache, $revision, 'p');
  if ($is_saving_cache and not $revision and $Page{revision} # don't save revision 0 pages
      and $Page{blocks} ne $blocks and $Page{flags} ne $flags) {
    $Page{blocks} = $blocks;
    $Page{flags}  = $flags;
    if ($is_locked or RequestLockDir('main')) { # not fatal!
      SavePage();
      ReleaseLock() unless $is_locked;
    }
  }
}
 
sub DoClearCache {
  return unless UserIsAdminOrError();
  RequestLockOrError();
  print GetHeader('', T('Clear Cache')), $q->start_div({-class=>'content clear'}),
    $q->p(T('Main lock obtained.')), '<p>';
  foreach my $id (AllPagesList()) {

t/oddmuse-wiki.pl  view on Meta::CPAN

2412
2413
2414
2415
2416
2417
2418
2419
2420
2421
2422
2423
2424
2425
2426
2427
2428
2429
2430
2431
2432
2433
2434
2435
2436
2437
}
 
sub PrintPageContent {
  my ($text, $revision, $comment) = @_;
  print $q->start_div({-class=>'content browse', -lang=>GetLanguage($text)});
  # This is a lot like PrintPageHtml except that it also works for older revisions
  if ($revision eq '' and $Page{blocks} and GetParam('cache', $UseCache) > 0) {
    PrintCache();
  } else {
    my $savecache = ($Page{revision} > 0 and $revision eq ''); # new page not cached
    PrintWikiToHTML($text, $savecache, $revision); # unlocked, with anchors, unlocked
  }
  if ($comment) {
    print $q->start_div({-class=>'preview'}), $q->hr();
    print $q->h2(T('Preview:'));
    # no caching, current revision, unlocked
    PrintWikiToHTML(AddComment('', $comment));
    print $q->hr(), $q->h2(T('Preview only, not yet saved')), $q->end_div();
  }
  print $q->end_div();
}
 
sub PrintFooter {
  my ($id, $rev, $comment, $page) = @_;
  if (GetParam('embed', $EmbedWiki)) {
    print $q->end_html, "\n";

t/oddmuse-wiki.pl  view on Meta::CPAN

3132
3133
3134
3135
3136
3137
3138
3139
3140
3141
3142
3143
3144
3145
3146
3147
3148
3149
3150
3151
  my $header;
  if ($revision and not $upload) {
    $header = Ts('Editing revision %s of', $revision) . ' ' . NormalToFree($id);
  } else {
    $header = Ts('Editing %s', NormalToFree($id));
  }
  print GetHeader('', $header), $q->start_div({-class=>'content edit'});
  if ($preview and not $upload) {
    print $q->start_div({-class=>'preview'});
    print $q->h2(T('Preview:'));
    PrintWikiToHTML($oldText); # no caching, current revision, unlocked
    print $q->hr(), $q->h2(T('Preview only, not yet saved')), $q->end_div();
  }
  if ($revision) {
    print $q->strong(Ts('Editing old revision %s.', $revision) . '  '
                     . T('Saving this page will replace the latest revision with this text.'))
  }
  print GetEditForm($id, $upload, $oldText, $revision), $q->end_div();
  PrintFooter($id, 'edit');
}

t/oddmuse-wiki.pl  view on Meta::CPAN

3259
3260
3261
3262
3263
3264
3265
3266
3267
3268
3269
3270
3271
3272
3273
3274
3275
3276
3277
3278
  return 1;
}
 
sub UserCanEditOrDie {
  my $id = shift;
  ValidIdOrDie($id);
  if (not UserCanEdit($id, 1)) {
    my $rule = UserIsBanned();
    if ($rule) {
      ReportError(T('Edit Denied'), '403 FORBIDDEN', undef,
                  $q->p(T('Editing not allowed: user, ip, or network is blocked.')),
                  $q->p(T('Contact the wiki administrator for more information.')),
                  $q->p(Ts('The rule %s matched for you.', $rule) . ' '
                        . Ts('See %s for more information.', GetPageLink($BannedHosts))));
    } else {
      ReportError(T('Edit Denied'), '403 FORBIDDEN', undef,
                  $q->p(Ts('Editing not allowed: %s is read-only.', NormalToFree($id))));
    }
  }
}

t/oddmuse-wiki.pl  view on Meta::CPAN

3427
3428
3429
3430
3431
3432
3433
3434
3435
3436
3437
3438
3439
3440
3441
3442
3443
3444
3445
3446
3447
3448
3449
3450
3451
3452
3453
3454
3455
}
 
sub WriteIndex {
  WriteStringToFile($IndexFile, join(' ', @IndexList));
}
 
sub RefreshIndex {
  @IndexList = ();
  %IndexHash = ();
  # If file exists and cannot be changed, error!
  my $locked = RequestLockDir('index', undef, undef, IsFile($IndexFile));
  foreach (Glob("$PageDir/*.pg"), Glob("$PageDir/.*.pg")) {
    next unless m|/.*/(.+)\.pg$|;
    my $id = $1;
    push(@IndexList, $id);
    $IndexHash{$id} = 1;
  }
  WriteIndex() if $locked;
  ReleaseLockDir('index') if $locked;
}
 
sub AddToIndex {
  my ($id) = @_;
  $IndexHash{$id} = 1;
  @IndexList = sort(keys %IndexHash);
  WriteIndex();
}
 
sub DoSearch {

t/oddmuse-wiki.pl  view on Meta::CPAN

4028
4029
4030
4031
4032
4033
4034
4035
4036
4037
4038
4039
4040
4041
4042
4043
4044
4045
4046
4047
4048
4049
4050
4051
4052
4053
4054
4055
4056
4057
4058
4059
4060
4061
4062
4063
4064
  delete $IndexHash{$id};
  @IndexList = sort(keys %IndexHash);
  return '';      # no error
}
 
sub DoEditLock {
  return unless UserIsAdminOrError();
  print GetHeader('', T('Set or Remove global edit lock'));
  my $fname = "$NoEditFile";
  if (GetParam("set", 1)) {
    WriteStringToFile($fname, 'editing locked.');
  } else {
    Unlink($fname);
  }
  utime time, time, $IndexFile; # touch index file
  print $q->p(IsFile($fname) ? T('Edit lock created.') : T('Edit lock removed.'));
  PrintFooter();
}
 
sub DoPageLock {
  return unless UserIsAdminOrError();
  print GetHeader('', T('Set or Remove page edit lock'));
  my $id = GetParam('id', '');
  ValidIdOrDie($id);
  my $fname = GetLockedPageFile($id);
  if (GetParam('set', 1)) {
    WriteStringToFile($fname, 'editing locked.');
  } else {
    Unlink($fname);
  }
  utime time, time, $IndexFile; # touch index file
  print $q->p(IsFile($fname) ? Ts('Lock for %s created.', GetPageLink($id))
              : Ts('Lock for %s removed.', GetPageLink($id)));
  PrintFooter();
}
 
sub DoShowVersion {



( run in 0.513 second using v1.01-cache-2.11-cpan-a9ef4e587e4 )